
Very Basic Lie Theory 
Author(s): Roger Howe 

Source: The American Mathematical Monthly, Vol. 90, No. 9 (Nov., 1983), pp. 600-623
Published by: Mathematical Association of America 
Stable URL: http://www.jstor.org/stable/2323277 

VERY BASIC LIE THEORY 


ROGER HOWE 

Department of Mathematics, Yale University, New Haven, CT 06520 

Lie theory, the theory of Lie groups, Lie algebras and their applications, is a fundamental part 
of mathematics. Since World War II it has been the focus of a burgeoning research effort, and is 
now seen to touch a tremendous spectrum of mathematical areas, including classical, differential, 
and algebraic geometry, topology, ordinary and partial differential equations, complex analysis 
(one and several variables), group and ring theory, number theory, and physics, from classical to 
quantum and relativistic. 

It is impossible in a short space to convey the full compass of the subject, but we will cite some 
examples. An early major success of Lie theory, occurring when the subject was still in its infancy, 
was to provide a systematic understanding of the relationship between Euclidean geometry and 
the newer geometries (hyperbolic non-Euclidean or Lobachevskian, Riemann’s elliptic geometry, 
and projective geometry) that had arisen in the 19th century. This led Felix Klein to enunciate his 
Erlanger Programm [Kl] for the systematic understanding of geometry. The principle of Klein’s 
program was that geometry should be understood as the study of quantities left invariant by the 
action of a group on a space. Another development in which Klein was involved was the 
Uniformization Theorem [Be] for Riemann surfaces. This theorem may be understood as saying 
that every connected two-manifold is a double coset space of the isometry group of one of the 3 
(Euclidean, hyperbolic, elliptic) standard 2-dimensional geometries. (See also the recent article [F] 
in this Monthly.) Three-manifolds are much more complex than two-manifolds, but the
intriguing work of Thurston [Th] has gone a long way toward showing that much of their structure 
can be understood in a way analogous to the 2-dimensional situation in terms of coset spaces of 
certain Lie groups. 

More or less contemporary with the final proof of the Uniformization Theorem was Einstein’s 
[E] invention of the special theory of relativity and its instatement of the Lorentz transformation 
as a basic feature of the kinematics of space-time. Einstein’s intuitive treatment of relativity was 
followed shortly by a more sophisticated treatment by Minkowski [Mk] in which Lorentz 
transformations were shown to constitute a certain Lie group, the isometry group of an indefinite 
Riemannian metric on $\mathbb{R}^4$. Similarly, shortly after Heisenberg [Hg] introduced his famous
Commutation Relations in quantum mechanics, which underlie his Uncertainty Principle, Her- 
mann Weyl [W] showed they could be interpreted as the structure relations for the Lie algebra of a 
certain two-step nilpotent Lie group. As the group-theoretical underpinnings of physics became 
better appreciated, some physicists, perhaps most markedly Wigner [Wg], in essence advocated 
extending Klein’s Erlanger Programm to physics. Today, indeed, symmetry principles based on 
Lie theory are a standard tool and a major source of progress in theoretical physics. Quark theory 
[Dy], in particular, is primarily a (Lie) group-theoretical construct. 

These examples could be multiplied many times. The applications of Lie theory are astonishing 
in their pervasiveness and sometimes in their unexpectedness. The articles of Borel [Bo2] and 
Dyson [Dy] mention some. The recent article of Proctor [Pr] in this Monthly discusses an 
application to combinatorics. Some points of contact of Lie theory with the undergraduate 
curriculum are listed in §7. 

The article of Proctor also illustrates the need to broaden understanding of Lie theory. Proctor 
did not feel he could assume knowledge of basic Lie theoretic facts. Though hardly an unknown 
subject, Lie theory is poorly known in comparison to its importance. Especially since it provides
unity of methods and viewpoints in the many subjects to which it relates, its wide dissemination
seems worthwhile. Yet it has barely penetrated the undergraduate curriculum, and it is far from
universally taught in graduate programs.

Roger Howe received his Ph.D. from the University of California at Berkeley in 1969. His advisor was Calvin C.
[...] and applied.

Part of the reason for the pedagogy gap is that standard treatments [A], [Ch], [He] of the 
foundations of Lie theory involve substantial prerequisites, including the basic theory of differen- 
tiable manifolds, some additional differential geometry, and the theory of covering spaces. This 
approach tends to put a course in Lie theory, when available, in the second year of graduate study, 
after specialization has already begun. While a complete discussion of Lie theory does require 
fairly elaborate preparation, a large portion of its essence is accessible on a much simpler level, 
appropriate to advanced undergraduate instruction. This paper attempts to present the theory at 
that level. It presupposes only a knowledge of point set topology and calculus in normed vector 
spaces. In fact, for the Lie theory proper, only normed vector spaces are necessary. This 
simplification is achieved by not considering general or abstract Lie groups, but only groups 
concretely realized as groups of matrices. Since such groups provide the great bulk of significant 
examples of Lie groups, for many purposes this restriction is unimportant. 

The essential phenomenon of Lie theory, to be explicated in the rest of this paper, is that one 
may associate in a natural way to a Lie group G its Lie algebra g. The Lie algebra g is first of all a 
vector space and secondly is endowed with a bilinear nonassociative product called the Lie bracket 
or commutator and usually denoted [ , ]. Amazingly, the group G is almost completely determined 
by g and its Lie bracket. Thus for many purposes one can replace G with g. Since G is a 
complicated nonlinear object and g is just a vector space, it is usually vastly simpler to work with 
g. Otherwise intractable computations may become straightforward linear algebra. This is one 
source of the power of Lie theory. 

The basic object mediating between Lie groups and Lie algebras is the one-parameter group. 
Just as an abstract group is a coherent system of cyclic groups, a Lie group is a (very) coherent 
system of one-parameter groups. The purpose of the first two sections, therefore, is to provide 
some general philosophy about one-parameter groups. Section 1 provides background on homeo- 
morphism groups, and one-parameter groups are defined in a general context in §2. Discussion of 
Lie groups proper begins in §3. Technically it is independent of §§1 and 2; but these sections will, 
I hope, give some motivation for reading on. Those who need no motivation or dislike philosophy 
may go directly to §3. There one-parameter groups of linear transformations are defined and are 
described by means of the exponential map on matrices. In §4 the exponential map is studied, and 
the commutator bracket makes its appearance. Section 5 is the heart of the paper. It defines and 
gives examples of matrix groups, the class of Lie groups considered in this paper. Then it defines 
Lie algebras, and shows that every matrix group can be associated to a Lie algebra which is 
related to its group in a close and precise way. The main statement is Theorem 17, and Theorem 
19 and Corollary 20 are important complements. Finally §6 ties up some loose ends and §7, as 
noted, describes some connections of Lie theory with the standard curriculum. 

Bibliographical note : The arguments of sections 3, 4 and 5 are very close to those given by von 
Neumann [Nn] in his 1929 paper on Hilbert’s 5th problem. A modern development of basic Lie
theory which incorporates these results is [Go]. 


1. Homeomorphism Groups

In this section, we use the standard terminology of general topology, as for example in [Ke]. 

Let X be a set. Then the collection Bi(X) of bijections from X to itself is a group with
composition of mappings as the group law. Now suppose X is in fact a topological space. Then the
set Hm(X) of homeomorphisms from X to itself is a subgroup of Bi(X). It seems natural to try to
topologize Hm(X). The topology should of course reflect how Hm(X) acts on X, so that maps
close to the identity move points very little. But a topology on Hm(X) should also be consistent
with the group structure of Hm(X). More precisely and generally, given a group G, if it is to be
made into a topological space in a manner consistent with its group structure, the topology it is
given should satisfy two conditions.

(1.1) (i) The multiplication map $(g_1, g_2) \mapsto g_1 g_2$ from $G \times G$ to $G$ should be continuous.

(ii) The inverse map $g \mapsto g^{-1}$ from $G$ to $G$ should be continuous.

A topology on G satisfying these two compatibility criteria is called a group topology. A group 
endowed with a group topology is called a topological group. Some standard treatments of 
topological groups are [Hn] and [P]. 

In short, then, we would like to make Hm(X) into a topological group. This is not so
satisfactorily done for completely general X, but if X is locally compact Hausdorff, there is a nice
topology on Hm(X), known as the compact-open topology. Before defining it, we make some
general observations about group topologies. These will simplify the definition.

Given a group G and an element $g \in G$, define $\lambda_g$, left-translation by g, and $\rho_g$, right-translation
by g, to be the maps

(1.2) $\lambda_g : G \to G, \qquad \rho_g : G \to G$

given by

$\lambda_g(g') = g g', \qquad \rho_g(g') = g' g^{-1}.$

For $U \subseteq G$, set

(1.3) $gU = \lambda_g(U), \qquad Ug = \rho_{g^{-1}}(U).$

Lemma 1. Let G be a topological group, and $g \in G$.

(a) The map $\lambda_g : G \to G$ is a homeomorphism. Similarly $\rho_g : G \to G$ is a homeomorphism.

(b) If $U \subseteq G$ is a neighborhood of the identity $1_G$ of G, then $gU$ and $Ug$ are neighborhoods of g.
Similarly if $V \subseteq G$ is a neighborhood of g, then $g^{-1}V$ and $Vg^{-1}$ are neighborhoods of $1_G$.

Proof. One checks from the definition of $\lambda_g$ that $g \mapsto \lambda_g$ is a homomorphism, i.e.,

(1.4) $\lambda_g \circ \lambda_h = \lambda_{gh}$

for $g, h \in G$. It follows directly from condition (1.1)(i) that $\lambda_g$ is continuous. Likewise, the
map $\lambda_{g^{-1}}$ is also continuous. From (1.4) one concludes that

(1.5) $\lambda_{g^{-1}} = (\lambda_g)^{-1}.$

Hence $\lambda_g$ is continuous with continuous inverse, that is, a homeomorphism. The proof for $\rho_g$ is
essentially identical.

Since $\lambda_g(1_G) = g$ and $\lambda_g(U) = gU$ by definition, part (b) follows since the homeomorphic
image of an open set is open. ■
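The algebraic identities behind this proof can be checked by brute force on a small group. The following is a minimal sketch, assuming the symmetric group S_3 as an illustrative choice (it is not an example from the text, and being finite it says nothing about continuity); the names `lam`, `rho`, and `compose` are hypothetical helpers. It verifies that both families of translations are homomorphic in g, as in (1.4), and that the translation by the inverse element is the inverse map, as in (1.5).

```python
from itertools import permutations

# Elements of S_3 as tuples p, where p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Group law: the composition 'first q, then p'."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def lam(g):
    """Left translation lambda_g(x) = g x, stored as a dict on G."""
    return {x: mul(g, x) for x in G}

def rho(g):
    """Right translation rho_g(x) = x g^{-1}, stored as a dict on G."""
    return {x: mul(x, inv(g)) for x in G}

def compose(f, h):
    """Composition of maps given as dicts: x -> f(h(x))."""
    return {x: f[h[x]] for x in G}

# (1.4) for lambda, and the analogous identity for rho.
hom_ok = all(
    compose(lam(g), lam(h)) == lam(mul(g, h))
    and compose(rho(g), rho(h)) == rho(mul(g, h))
    for g in G for h in G
)

# (1.5): translation by g^{-1} undoes translation by g.
identity_map = {x: x for x in G}
inverse_ok = all(compose(lam(inv(g)), lam(g)) == identity_map for g in G)

print(hom_ok, inverse_ok)
```

Note that the convention $\rho_g(g') = g' g^{-1}$ is what makes right translation a homomorphism rather than an anti-homomorphism in g.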

Corollary 2. A group topology is determined by its system of neighborhoods of the identity.

Proof. Indeed, a topology on G is determined by the collection of neighborhood systems of
each point of G. But according to part (b) of the lemma, for a group topology, the system of
neighborhoods around a point $g \in G$ is determined by the system of neighborhoods around $1_G$. ■

Let us call a topology on G such that all $\lambda_g$ and $\rho_g$ are homeomorphisms a homogeneous
topology. Lemma 1 says group topologies are homogeneous. Evidently Corollary 2 applies to all
homogeneous topologies, not only group topologies. Thus an obvious question is: what conditions
must a neighborhood system at the identity satisfy in order that the associated homogeneous
topology be a group topology? This question has a simple answer.

Lemma 3. A homogeneous topology on a group G is a group topology if and only if the system of
neighborhoods of $1_G$ satisfies conditions (a) and (b) below.

(a) If U is a neighborhood of $1_G$, there is another neighborhood V of $1_G$ such that $V \subseteq U^{-1}$, where

(1.6) $U^{-1} = \{g^{-1} : g \in U\}.$

(b) If U is a neighborhood of $1_G$, there are other neighborhoods V, W of $1_G$ such that $VW \subseteq U$,
where

(1.7) $VW = \{gh : g \in V, h \in W\}.$

Proof. The conditions (a) and (b) are clearly necessary for a topology to be a group topology,
since they are one way of stating that the inverse and multiplication maps are continuous at $1_G$.
We will check this for condition (b). In order for multiplication to be continuous at $1_G \times 1_G$,
given a neighborhood $U \subseteq G$ of $1_G$, we must find a neighborhood $U' \subseteq G \times G$ of $1_G \times 1_G$ such
that for any point $(g, g')$ of $U'$, the product $gg'$ is in U. But by definition of the product topology
on $G \times G$, any neighborhood $U'$ of $1_G \times 1_G$ contains a product $V \times W$, where $V, W \subseteq G$ are
both neighborhoods of $1_G$. But the image of $V \times W$ under the multiplication map is just the set
$VW$ defined in (1.7). So condition (b) amounts to continuity of multiplication at the point
$1_G \times 1_G \in G \times G$.

Thus to complete the lemma we need to show that if the multiplication and inverse maps are
continuous at the identity, and if the topology on G is homogeneous, then they are continuous
everywhere. Let U be a neighborhood of $1_G$. Then since $\lambda_g$ is a homeomorphism, $gU$ is a
neighborhood of g, and we may write

(1.8) $(gu)^{-1} = u^{-1} g^{-1} = \rho_g(u^{-1}) = \rho_g\big((\lambda_{g^{-1}}(gu))^{-1}\big), \qquad u \in U.$

Thus on $gU$, the inverse map is a composition of $\lambda_{g^{-1}}$, the inverse map on U, and $\rho_g$. Since $\lambda_{g^{-1}}$
and $\rho_g$ are continuous, and $\lambda_{g^{-1}}$ takes g to $1_G$, and the inverse map is continuous at $1_G$, we see that
the inverse map is continuous at g also. The proof that multiplication is continuous everywhere is
analogous and is left as an exercise. ■

We return to the question of topologizing Hm(X). Corollary 2 allows us to save work in our
definition of the topology on Hm(X) by only defining neighborhoods of the identity map $1_X$ on
X, and declaring by fiat all left or right translates of these neighborhoods also to be open sets.
Lemma 3 tells us what we must check to know our definition yields a group topology.

From now on, we take X to be a locally compact Hausdorff space. Let $C \subseteq X$ be compact, and
let $O \supseteq C$ be open. Define

(1.9) $U(C, O) = \{h \in \mathrm{Hm}(X) : h(C) \subseteq O,\ h^{-1}(C) \subseteq O\}.$

If $\{C_i\}$, $1 \le i \le n$, are compact subsets of X, and $\{O_i\}$ are open subsets of X such that $C_i \subseteq O_i$,
set

(1.10) $U(\{C_i\}, \{O_i\}) = \bigcap_{i=1}^{n} U(C_i, O_i).$

Definition. Let X be a locally compact Hausdorff space. The compact-open topology on
Hm(X) is the homogeneous topology such that a base for the neighborhoods of $1_X$ consists of the
sets $U(\{C_i\}, \{O_i\})$ of equation (1.10).

Proposition 4. The compact-open topology on Hm(X) is a Hausdorff group topology.

Proof. Since we have decreed the compact-open topology to be homogeneous, we need only
check the conditions of Lemma 3 to show it is a group topology. Condition (a) is automatic since
the sets $U(C, O)$ are defined to be invariant under the inverse map on Hm(X). Let us check
condition (b). If $U_i$, $V_i$ and $W_i$ are neighborhoods of $1_X$ such that $V_i W_i \subseteq U_i$, then evidently

$\big(\bigcap_i V_i\big)\big(\bigcap_i W_i\big) \subseteq \bigcap_i U_i.$

Hence since the sets (1.10) are intersections of the sets $U(C, O)$ of (1.9), it will be enough to check
condition (b) with the neighborhood U of the form $U = U(C, O)$. Since X is locally compact
Hausdorff, we can by a standard separation theorem (cf. [Ke, Chap. 5, Theorem 18]) find an open
$O' \subseteq X$ such that the closure $C'$ of $O'$ is compact, and

$C \subseteq O' \subseteq C' \subseteq O.$

Then set $V = W = U(C', O) \cap U(C, O')$. If $h_1, h_2 \in V$, we find

$(h_1 \circ h_2)(C) = h_1(h_2(C)) \subseteq h_1(O') \subseteq h_1(C') \subseteq O$

and similarly for $(h_1 \circ h_2)^{-1} = h_2^{-1} \circ h_1^{-1}$. Thus $h_1 \circ h_2 \in U(C, O) = U$, or $VW \subseteq U$, as was to be
shown, and the compact-open topology is a group topology on Hm(X).

To show that a group topology is Hausdorff is a fairly simple matter. We record the relevant 
observation as a separate result. 

Lemma 5. Let G be a topological group. Let $H \subseteq G$ be the intersection of all neighborhoods of $1_G$.
Then H is a normal subgroup of G. Further, G is Hausdorff if and only if $H = \{1_G\}$.

Proof. Suppose $h_1, h_2 \in H$. Given a neighborhood U of $1_G$, we can find neighborhoods V, W
of $1_G$ such that $VW \subseteq U$. Since $h_1 \in V$ and $h_2 \in W$, we see that $h_1 h_2 \in U$. Hence $h_1 h_2 \in H$ also.
In similar fashion, one sees that $h_1^{-1} \in H$. Hence H is a group. Since the conjugate $gUg^{-1}$ of a
neighborhood of $1_G$ is again a neighborhood of $1_G$, we see that H is also normal in G.

If G is Hausdorff, each $g \neq 1_G$ is excluded from some neighborhood of $1_G$, so $H = \{1_G\}$.
Conversely, suppose $H = \{1_G\}$. Then given $g \in G$ with $g \neq 1_G$, we can find a neighborhood U of $1_G$
such that $g \notin U$. Let V, W be neighborhoods of $1_G$ such that $WV \subseteq U$. Then $gV^{-1}$ and W are
neighborhoods of g and of $1_G$, respectively, and are disjoint, since $gv^{-1} = w$ would give $g = wv \in U$.
Now consider any two distinct points $g_1, g_2 \in G$. Set $g = g_1^{-1} g_2$ and
apply the argument above. Translating on the left by $g_1$, we find $g_1 W$ and $g_2 V^{-1}$ are disjoint
neighborhoods of $g_1$ and $g_2$, respectively. Hence G is Hausdorff. ■

From Lemma 5 we see Proposition 4 will be proved if we produce for each $h \neq 1_X$ in Hm(X) a
compact C and open O such that $h \notin U(C, O)$. Choose $x \in X$ such that $h(x) \neq x$. Then
evidently $h \notin U(\{x\}, X - \{h(x)\})$. ■

Remark. In fact the compact-open topology on Hm(X) is better than Proposition 4 indicates.
It is complete with respect to an appropriate uniform structure ([Ke, Chap. 6]). Also, if X is second
countable (hence metrizable), then Hm(X) is also second countable and metrizable.

2. One-Parameter Groups: Flows and Differential Equations 

The real number system $\mathbb{R}$ equipped with addition and its familiar topology is, as the reader
may easily check, a topological group.

Definition. A one-parameter group of homeomorphisms of (the locally compact Hausdorff
space) X is a continuous homomorphism

(2.1) $\varphi : \mathbb{R} \to \mathrm{Hm}(X).$

It will be convenient to denote the image under $\varphi$ of t by $\varphi_t$ rather than $\varphi(t)$. Thus $\{\varphi_t\}$ is a
family of homeomorphisms of X satisfying the rule

(2.2) $\varphi_t \circ \varphi_s = \varphi_{t+s}, \qquad t, s \in \mathbb{R}.$

Since for each t the map $\varphi_t$ acts on X, a one-parameter group of homeomorphisms of X is also
called an $\mathbb{R}$-action on X, or an action by $\mathbb{R}$ on X.

Given a one-parameter group $\varphi_t$ of homeomorphisms of X, we can define a map

(2.3) $\Phi : \mathbb{R} \times X \to X,$

$\Phi(t, x) = \varphi_t(x).$

The fact that $t \mapsto \varphi_t$ is a homomorphism is captured by the identities

(2.4) (i) $\Phi(0, x) = x,$

(ii) $\Phi(s, \Phi(t, x)) = \Phi(s + t, x).$

The continuity of $\varphi$ is reflected in the continuity of $\Phi$. We state this fact formally. It will perhaps
also shed some light on the significance of the compact-open topology on Hm(X).
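The identities (2.4) are easy to test numerically for a concrete flow. The sketch below assumes, purely for illustration, that X is the plane and that the flow is rotation about the origin through t radians; the function names `flow` and `close` are hypothetical.

```python
import math

def flow(t, p):
    """Phi(t, p): rotate the point p = (x, y) through t radians."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def close(p, q, tol=1e-12):
    """Coordinatewise comparison up to floating-point tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

p = (1.5, -0.25)
s, t = 0.7, -2.1

identity_ok = close(flow(0.0, p), p)                    # identity (2.4)(i)
group_ok = close(flow(s, flow(t, p)), flow(s + t, p))   # identity (2.4)(ii)

print(identity_ok, group_ok)
```

Here (2.4)(ii) is just the addition formula for sine and cosine in disguise, which is why the check passes up to rounding error.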

Lemma 6. Let $\Phi : \mathbb{R} \times X \to X$ be a map. For $t \in \mathbb{R}$, define $\varphi_t : X \to X$ by $\varphi_t(x) = \Phi(t, x)$, as in formula (2.3).
Then $\{\varphi_t\}$ is a one-parameter group of homeomorphisms if and only if

(a) $\Phi$ satisfies identities (2.4) and

(b) $\Phi$ is continuous.

Remark. According to this lemma, if our goal were simply to define a one-parameter group of
homeomorphisms in the quickest way, we could short-circuit the whole discussion of §1 and
simply define a one-parameter group of homeomorphisms as a map $\Phi$ satisfying the conditions of
the lemma. However, that approach seemed unduly formalistic.

Proof. It is a straightforward computation to verify that the identities (2.4) guarantee that for
each t the map $\varphi_t$ is in Bi(X) and $t \mapsto \varphi_t$ is a homomorphism. Also it is obvious that the maps $\varphi_t$
will be in Hm(X) if and only if $\Phi$ is continuous in x for each fixed t. Thus the main thrust of the
lemma is that $t \mapsto \varphi_t$ is continuous from $\mathbb{R}$ to Hm(X) if and only if $\Phi$ is jointly continuous in t
and x. Let us verify this.

Suppose $\Phi$ is continuous. Let $C \subseteq X$ be compact, and $O \subseteq X$ be open, with $C \subseteq O$. Choose
any $x \in C$. By identity (2.4)(i), the point $(0, x) \in \mathbb{R} \times X$ is in $\Phi^{-1}(O)$. By continuity of $\Phi$ a
neighborhood of $(0, x)$ is contained in $\Phi^{-1}(O)$. This means there is a neighborhood N of x in X,
and $\delta > 0$, depending on x, N, and O, such that $\Phi(t, y) \in O$ for $y \in N$ and $|t| < \delta$. In other words
$\varphi_t(y) \in O$ for $|t| < \delta$ and $y \in N$. Since C is compact, we can find a finite number of $x_i \in C$ such
that the associated neighborhoods $N_i$ cover C. Suppose then that $\Phi(t, y_i) \in O$ for $y_i \in N_i$ and
$|t| < \delta_i$. Set $\delta = \min \delta_i$. Then we have $\Phi(t, c) \in O$ for all $c \in C$ and $|t| < \delta$. Since
$\varphi_t^{-1} = \varphi_{-t}$ and $|-t| = |t|$, this says that
$\varphi_t \in U(C, O)$ for $|t| < \delta$. Clearly, by repeating this argument for any finite collection of compact
$C_i$'s and open $O_i$'s containing them, we can show that $\varphi_t \in U(\{C_i\}, \{O_i\})$ for all sufficiently
small t. This shows that $t \mapsto \varphi_t$ is continuous at the origin in $\mathbb{R}$. But now we appeal to the
following lemma.

Lemma 7. Let $\varphi : G \to H$ be a homomorphism between topological groups. Then $\varphi$ is continuous if
and only if $\varphi$ is continuous at $1_G$.

The proof of this lemma is left as an exercise to the reader, who will recognize in it the same
spirit that informs Corollary 2 and Lemmas 3 and 5.

To finish Lemma 6, we must show that the continuity of $t \mapsto \varphi_t$ implies continuity of $\Phi$.
Choose $(t, x) \in \mathbb{R} \times X$, and set $y = \Phi(t, x)$. Let V be an open neighborhood of y. Since $\varphi_t$ is
continuous, we can find a neighborhood W of x, with compact closure $\overline{W}$, such that $\varphi_t(\overline{W}) \subseteq V$.
Since $\varphi_s$ is continuous in s, we can find $\epsilon > 0$ so that $\varphi_s \in U(\varphi_t(\overline{W}), V)$ for $|s| < \epsilon$. But then if
$(t', w) \in (t - \epsilon, t + \epsilon) \times W$, we have

$\Phi(t', w) = \varphi_{t'-t}(\varphi_t(w)) \in \varphi_{t'-t}(\varphi_t(\overline{W})) \subseteq V.$

In other words $(t - \epsilon, t + \epsilon) \times W \subseteq \Phi^{-1}(V)$. Since V was an arbitrary open neighborhood of y, we see
$\Phi$ is continuous at $(t, x)$. Since $(t, x)$ is arbitrary, we see $\Phi$ is continuous. ■

Consider a one-parameter group $\varphi_t$ of homeomorphisms of X and the associated map $\Phi$
defined by formula (2.3). The map $\Phi$ is a function of two variables, t and x, and the maps $\varphi_t$ are
obtained from $\Phi$ by temporarily fixing t and letting x vary. If on the other hand we fix x and let t
vary, we get a map $t \mapsto \Phi(t, x) = \varphi_t(x)$ which defines a continuous curve in X, traced by the
moving point $\varphi_t(x)$. Thus as t varies, each point of X moves continuously inside X, and various
points move in a coherent fashion, so that we can form a mental picture of them flowing through
X, each point along its individual path. For this reason, a one-parameter group of homeomorphisms
of X is also sometimes called a flow on X.

The notion of a flow is closely related to the theory of differential equations. Indeed, let
$X = \mathbb{R}^n$, and write

$x = (x_1, \ldots, x_n), \qquad x \in \mathbb{R}^n,\ x_i \in \mathbb{R}.$

Then

$\Phi(t, x) = \Phi(t, x_1, x_2, \ldots, x_n) = (\Phi_1(t, x), \ldots, \Phi_n(t, x))$

is a function from $\mathbb{R}^{n+1}$ to $\mathbb{R}^n$. Suppose $\Phi$ is not merely continuous, but differentiable. Define

$f : \mathbb{R}^n \to \mathbb{R}^n$

by

(2.5) $f(x) = \frac{\partial \Phi}{\partial t}(t, x)\Big|_{t=0}.$

If we differentiate (2.4)(ii) with respect to s, and set s = 0, we obtain

(2.6) $\frac{\partial \Phi}{\partial t}(t, x) = f(\Phi(t, x)).$

In other words, for fixed x, the map $y_x(t) = \Phi(t, x)$ is a solution of the system of differential
equations

(2.7) $\frac{dy}{dt} = f(y),$ or $\frac{dy_i}{dt} = f_i(y_1, \ldots, y_n)$ for $1 \le i \le n$.

The solution $y_x$ is the solution of (2.7) with initial condition $y_x(0) = x$.

The system (2.7) may be pictured geometrically as follows. At each point $y \in \mathbb{R}^n$, one draws
the vector $f(y) = (f_1(y), f_2(y), \ldots, f_n(y))$. This gives a family of vectors which vary smoothly as
y varies; such a family is called a vector field. A solution of the system (2.7) is a parametrized
curve $c(t)$ in $\mathbb{R}^n$, such that at each point $c(t)$ of the curve the tangent vector $c'(t)$ is the
pre-assigned vector $f(c(t))$. The 2-dimensional system

$\frac{d(x, y)}{dt} = (-y, x),$

whose solutions are the circles

$(x(t), y(t)) = (a\cos(\theta_0 + t),\ a\sin(\theta_0 + t)),$

is illustrated in Fig. 1.
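The claim that the circles solve this system can be checked by comparing a finite-difference tangent vector with the vector field along the curve. This is only a sketch: the radius a = 2, the starting angle 0.3, the sample times, and the helper names are arbitrary assumptions made for the illustration.

```python
import math

def f(p):
    """Vector field of the system d(x, y)/dt = (-y, x)."""
    x, y = p
    return (-y, x)

def curve(t, a=2.0, theta0=0.3):
    """Candidate solution: the circle of radius a with starting angle theta0."""
    return (a * math.cos(theta0 + t), a * math.sin(theta0 + t))

def tangent(t, h=1e-6):
    """Central-difference approximation to the tangent vector c'(t)."""
    (x1, y1), (x0, y0) = curve(t + h), curve(t - h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

# Largest discrepancy between c'(t) and f(c(t)) over a few sample times.
max_error = max(
    abs(u - v)
    for t in [0.0, 0.5, 1.0, 2.0, 4.0]
    for u, v in zip(tangent(t), f(curve(t)))
)

print(max_error)
```

The discrepancy is limited only by the finite-difference step, which is consistent with the tangent vector being exactly the prescribed vector $f(c(t))$.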

Suppose on the other hand that for each x we have a solution $y_x(t)$ of the system (2.7) with
initial condition $y_x(0) = x$. For $s \in \mathbb{R}$, consider the function

$y_{x,s}(t) = y_x(t + s).$

Differentiation of $y_{x,s}$ shows it also is a solution of the system (2.7), evidently with initial value
$y_{x,s}(0) = y_x(s)$. The uniqueness part of the Existence and Uniqueness Theorem for ordinary
differential equations [L], [HS], [R], therefore implies that

(2.8) $y_x(s + t) = y_{x,s}(t) = y_{y_x(s)}(t).$

If we then set

$\Phi(t, x) = y_x(t),$

we find that identity (2.8) translates into identity (2.4)(ii). Of course, the initial condition
$y_x(0) = x$ is just identity (2.4)(i). It follows that $\varphi_t(x) = y_x(t)$ defines a one-parameter group of
homeomorphisms of $\mathbb{R}^n$. For the system of Fig. 1, the map $\varphi_t$ is just rotation through an angle of t
radians.
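The passage from solutions to a flow can also be observed numerically. The sketch below swaps in a classical fourth-order Runge-Kutta integrator as a stand-in for the exact solution $y_x(t)$ of the system of Fig. 1 (the integrator, step count, and sample values are assumptions of this illustration, not part of the text). It then checks identity (2.8) and the description of $\varphi_t$ as rotation through t radians, both up to discretization error.

```python
import math

def f(p):
    """Vector field of the system d(x, y)/dt = (-y, x)."""
    x, y = p
    return (-y, x)

def rk4_flow(t, p, steps=1000):
    """Approximate y_p(t) by integrating dy/dt = f(y) from y(0) = p."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = f((x, y))
        k2 = f((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = f((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = f((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

p, s, t = (1.0, 0.5), 0.8, 1.3

# Identity (2.8): flowing for time s + t equals flowing for s, then for t.
group_law_gap = dist(rk4_flow(s + t, p), rk4_flow(t, rk4_flow(s, p)))

# For this system the time-t map should be rotation through t radians.
rotated = (math.cos(t) * p[0] - math.sin(t) * p[1],
           math.sin(t) * p[0] + math.cos(t) * p[1])
rotation_gap = dist(rk4_flow(t, p), rotated)

print(group_law_gap, rotation_gap)
```

Both gaps are small but not zero: the group law holds exactly for the true solutions and only approximately for the numerical ones.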

In summary, we have seen that a (smooth) one-parameter group of diffeomorphisms of $\mathbb{R}^n$
yields solutions of a system of differential equations of the form (2.7), and, conversely, a solution
(for all time and all x) of system (2.7) yields a one-parameter group. The two constructs, solutions
of systems of ordinary differential equations, and one-parameter groups, thus provide two
different points of view on the same mathematical phenomenon. In other words, the notion of
one-parameter group provides a geometric and global way of looking at the solutions of a system
of ordinary differential equations. As such, it suggests ways of attacking and obtaining information
about ordinary differential equations, and it provides a link between systems of ordinary
differential equations and more complex geometric objects such as the Lie groups and Lie
algebras discussed in the following sections.

3. One-Parameter Groups of Linear Transformations 

In this section, we show how one-parameter groups of linear transformations of a vector space 
can be described using the exponential map on matrices. 

Let V be a finite dimensional real vector space. Let End(V) denote the algebra of linear maps
from V to itself, and let GL(V) denote the group of invertible linear maps from V to itself. The
usual name for GL(V) is the general linear group of V. If $V = \mathbb{R}^n$, then $\mathrm{End}(V) = M_n(\mathbb{R})$, the
$n \times n$ matrices, and $GL(V) = GL_n(\mathbb{R})$, the matrices with nonvanishing determinants.

Let $\|\cdot\|$ be a norm on V (cf. [L], [N]). In the usual way there is induced an operator norm, also
denoted $\|\cdot\|$, on End(V). We recall the definition:

(3.1) $\|A\| = \sup\left\{ \frac{\|A(v)\|}{\|v\|} : v \in V - \{0\} \right\}, \qquad A \in \mathrm{End}(V).$

The norm on End(V) makes End(V) into a metric space. Since the determinant is a continuous
function on End(V), we know that GL(V) is an open subset of End(V) (see also (3.6) below), so it
also is a metric space.

Definition. A one-parameter group of linear transformations of V is a continuous homomorphism

(3.2) $M : \mathbb{R} \to GL(V).$

Thus M(t) is a collection of linear maps such that

(i) $M(0) = 1_V$, the identity of V,

(ii) $M(s)M(t) = M(s + t), \qquad s, t \in \mathbb{R},$

(iii) M(t) depends continuously on t.


Remarks. (a) The topology on GL(V) is easily verified to be a group topology as defined in §1.
Thus, for $A \in \mathrm{End}\,V$ and $r > 0$, set

(3.3) $\mathcal{B}_r(A) = \{A' \in \mathrm{End}\,V : \|A' - A\| < r\}.$

First, the basic formula [N, p. 76],

(3.4) $\|AB\| \le \|A\|\,\|B\|$

implies that left and right multiplication are continuous. Hence the topology is homogeneous.
Then the Neumann formula [N, p. 177],

(3.5) $(1_V - A)^{-1} = \sum_{n=0}^{\infty} A^n,$

valid for A with $\|A\| < 1$, shows that for $r < 1$

(3.6) $(\mathcal{B}_r(1_V))^{-1} \subseteq \mathcal{B}_s(1_V)$

with $s = r/(1 - r)$. Similarly the formula

(3.7) $(1_V + A)(1_V + B) = 1_V + A + B + AB$

shows

$\mathcal{B}_r(1_V)\,\mathcal{B}_s(1_V) \subseteq \mathcal{B}_{r+s+rs}(1_V).$

Thus all the conditions of Lemma 3 are checked, and we have a group topology.
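The estimates in Remark (a) can be tried out on small matrices. The sketch below assumes V is the plane with the sup norm, so that the induced operator norm is the largest absolute row sum; that choice of norm, the sample matrices, and all helper names are assumptions of this illustration. It checks (3.4), sums the Neumann series (3.5), and confirms the bound behind (3.6).

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def op_norm(a):
    """Operator norm induced by the sup norm on R^n: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def neumann_inverse(a, terms=200):
    """Partial sum 1 + a + a^2 + ... of the series (3.5) for (1 - a)^{-1}."""
    total = identity(len(a))
    power = identity(len(a))
    for _ in range(terms):
        power = mat_mul(power, a)
        total = mat_add(total, power)
    return total

A = [[0.2, -0.1], [0.05, 0.3]]
B = [[-0.4, 0.1], [0.2, 0.1]]
ONE = identity(2)
r = op_norm(A)   # r < 1 here, so the Neumann series converges

# (3.4): the operator norm is submultiplicative.
submultiplicative = op_norm(mat_mul(A, B)) <= r * op_norm(B) + 1e-15

# (3.5): the series really inverts 1 - A.
inv = neumann_inverse(A)
residual = op_norm(mat_add(mat_mul(mat_add(ONE, A, -1.0), inv), ONE, -1.0))

# (3.6): the inverse stays within distance r / (1 - r) of the identity.
within_ball = op_norm(mat_add(inv, ONE, -1.0)) <= r / (1 - r) + 1e-12

print(submultiplicative, residual, within_ball)
```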

(b) Furthermore, it is not difficult to verify that the topology defined by the norm coincides
with the compact-open topology defined in §1 on GL(V) as a subgroup of Hm(V). This is left as
an exercise. Hence this definition of one-parameter group is a special case of the definition of §2.

For $A \in \mathrm{End}\,V$, define

(3.8) $\exp(A) = \sum_{n=0}^{\infty} \frac{A^n}{n!}.$

Since $\|A^n\| \le \|A\|^n$, we see, by the standard estimates in the exponential series, that the series
defining exp A converges absolutely for all A and uniformly on any $\mathcal{B}_r(0)$. Hence exp defines a
smooth, in fact analytic, map from End(V) to itself. We will see shortly that in fact $\exp A \in GL(V)$.
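Definition (3.8) translates directly into a partial-sum computation. The following is a minimal sketch, assuming a fixed truncation at 40 terms (adequate for the small matrices used here, but not a general-purpose algorithm); `mat_exp` is a hypothetical helper name. Two cases with known answers are computed: a diagonal matrix, where exp acts entrywise on the diagonal, and a nilpotent matrix, where the series terminates.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Partial sum of (3.8): the sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]   # a^0 / 0!
    for k in range(1, terms):
        # Build a^k / k! from a^(k-1) / (k-1)!.
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[x + y for x, y in zip(r1, r2)]
                  for r1, r2 in zip(result, term)]
    return result

D = [[1.0, 0.0], [0.0, 2.0]]   # diagonal: exp(D) should be diag(e, e^2)
N = [[0.0, 1.0], [0.0, 0.0]]   # nilpotent, N^2 = 0: exp(N) should be 1 + N

exp_D = mat_exp(D)
exp_N = mat_exp(N)

print(exp_D)
print(exp_N)
```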

Proposition 8. If A and B in End V commute with each other, then

(3.9) $\exp(A + B) = \exp A \exp B.$

Proof. Computing formally we have

$\exp A \exp B = \Big(\sum_{n=0}^{\infty} \frac{A^n}{n!}\Big)\Big(\sum_{m=0}^{\infty} \frac{B^m}{m!}\Big) = \sum_{n,m=0}^{\infty} \frac{A^n B^m}{n!\,m!} = \sum_{l=0}^{\infty} \frac{1}{l!} \Big(\sum_{m+n=l} \frac{l!}{m!\,n!} A^n B^m\Big).$

If A and B commute, the familiar binomial formula applies and says

$(A + B)^l = \sum_{m+n=l} \frac{l!}{m!\,n!} A^n B^m.$

Substituting this in our formula for exp A exp B, and noting that all manipulations are valid
because the series converge absolutely, we see the proposition follows. ■

Corollary 9. For any $A \in \mathrm{End}\,V$, the map $t \mapsto \exp(tA)$ is a one-parameter group of linear
transformations on V. In particular $\exp A \in GL(V)$ and $(\exp(A))^{-1} = \exp(-A)$.

Proof. Since for any real numbers s and t the matrices sA and tA commute with one another,
this corollary follows immediately from Proposition 8. ■
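A small numerical experiment shows both that (3.9) holds for commuting matrices and that the commutativity hypothesis cannot be dropped. This is a sketch under assumptions: the exponential is a truncated series, and the sample matrices and helper names are chosen only for illustration. The last two quantities check the group law and the inverse formula of Corollary 9.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """Truncated exponential series: the sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, a))
        result = add(result, term)
    return result

def max_gap(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[1.0, 2.0], [0.0, 1.0]]
B = mat_mul(A, A)               # a polynomial in A, hence AB = BA
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]    # XY != YX

commuting_gap = max_gap(mat_exp(add(A, B)), mat_mul(mat_exp(A), mat_exp(B)))
noncommuting_gap = max_gap(mat_exp(add(X, Y)), mat_mul(mat_exp(X), mat_exp(Y)))

# Corollary 9: t -> exp(tA) obeys the group law, and exp(-A) inverts exp(A).
s, t = 0.4, -1.1
group_gap = max_gap(mat_mul(mat_exp(scale(s, A)), mat_exp(scale(t, A))),
                    mat_exp(scale(s + t, A)))
inverse_gap = max_gap(mat_mul(mat_exp(A), mat_exp(scale(-1.0, A))),
                      [[1.0, 0.0], [0.0, 1.0]])

print(commuting_gap, noncommuting_gap, group_gap, inverse_gap)
```

For the non-commuting pair the two sides differ visibly, which is the phenomenon the commutator bracket is later introduced to measure.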

The main result of this section is the converse of Corollary 9. 

Theorem 10. Every one-parameter group M of linear transformations of V has the form

(3.10) $M(t) = \exp(tA)$

for some $A \in \mathrm{End}\,V$.

The transformation A is called the infinitesimal generator of the group $t \mapsto \exp(tA)$. The flow
illustrated in Figure 1 is in fact given by a one-parameter group with infinitesimal generator

$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$

Remark. Since for $v \in V$ we have

$(\exp tA)(v) = v + tA(v) + \sum_{n=2}^{\infty} \frac{t^n A^n(v)}{n!},$

the infinitesimal generator of the one-parameter group $M(t) = \exp(tA)$ can be computed by the
formula

(3.11) $A(v) = \lim_{t \to 0} \frac{M(t)(v) - v}{t} = \frac{d}{dt}\big(M(t)(v)\big)\Big|_{t=0}.$

Thus the one-parameter group M(t) is associated by the discussion at the end of §2 to the system
of differential equations

(3.12) $\frac{dv}{dt} = A(v).$
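For the rotation group of Figure 1 both directions of this correspondence can be seen numerically: exponentiating the generator gives the rotation matrices, and the difference quotient in (3.11) recovers the generator. The sketch below uses a truncated exponential series and a small but finite step h, so the recovery is only approximate; the helper names and sample values are assumptions of the illustration.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """Truncated exponential series: the sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, a))
        result = [[x + y for x, y in zip(r1, r2)]
                  for r1, r2 in zip(result, term)]
    return result

def max_gap(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

J = [[0.0, -1.0], [1.0, 0.0]]   # the infinitesimal generator named in the text

def M(t):
    """The one-parameter group t -> exp(tJ)."""
    return mat_exp(scale(t, J))

# exp(tJ) should be rotation through t radians.
t = 0.9
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
rotation_gap = max_gap(M(t), rotation)

# Formula (3.11): (M(h) - 1) / h approximates J for small h.
h = 1e-6
Mh = M(h)
quotient = [[(Mh[i][j] - (1.0 if i == j else 0.0)) / h for j in range(2)]
            for i in range(2)]
generator_gap = max_gap(quotient, J)

print(rotation_gap, generator_gap)
```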

These equations are of course basic in the theory of linear systems, which is applied in electrical 
engineering, economics, etc. If we know that M(t)(v) is differentiable, then the existence and 
uniqueness theorem for differential equations implies Theorem 10, but we do not know a priori 
that M(t) is differentiable. The burden of the proof of Theorem 10 is to get around this ignorance,
thereby establishing that a merely continuous map $t \mapsto M(t)$ satisfying the group law (3.2)(ii) is
in fact analytic. This is a recurrent theme in Lie theory, and is also expressed in the main theorem 
(Theorem 17) of this paper. It found its ultimate expression in Hilbert’s 5th Problem: to show that 
if a topological group is locally (i.e., a neighborhood of every point is) homeomorphic to 
Euclidean space, then the group is in fact an analytic manifold with analytic group law (a Lie 
group). This problem was resolved positively in the early 1950’s by A. Gleason [G]. See also [Ka], 
[MZ]. 

We take up now the proof of Theorem 10. It will require some preliminary results. 

Let $\mathcal{B}_r(A)$ be the open ball of radius $r$ around $A$, as defined in formula (3.3).

Proposition 11. For sufficiently small $r > 0$, the map $\exp$ takes $\mathcal{B}_r(0)$ bijectively onto an open
neighborhood of $1_V$ in $\mathrm{GL}(V)$. One has $\exp(\mathcal{B}_r(0)) \subseteq \mathcal{B}_s(1_V)$ where $s = e^r - 1$.

Proof. Let $D\exp_A$ be the differential of $\exp$ at $A$. It is a linear map from $\mathrm{End}(V)$ to $\mathrm{End}(V)$
defined by

$$D\exp_A(B) = \lim_{t \to 0} \frac{\exp(A + tB) - \exp(A)}{t}.$$

From the definition (3.8) of $\exp$, it is easy to compute that

$$D\exp_0(B) = B.$$

That is, $D\exp_0$ is the identity map on $\mathrm{End}(V)$. In particular $D\exp_0$ is invertible. Therefore the first
statement of the proposition follows from the Inverse Function Theorem [L], [R]. The inclusion
$\exp(\mathcal{B}_r(0)) \subseteq \mathcal{B}_s(1_V)$ follows from the obvious termwise estimation of $\exp(A) - 1_V$. ■

Remark. If one defines

(3.13)    $\log(1_V - A) = -\sum_{n=1}^{\infty} \dfrac{A^n}{n}$,

then just as for real numbers, one sees this series converges absolutely for $\|A\| < 1$. Further, for all
$B \in \mathcal{B}_1(1_V)$ one has

(3.14)    $\exp(\log B) = B$.

Formula (3.14) is known in the scalar case, and this implies that in fact (3.14) is an identity in
absolutely convergent power series, whence it follows in the matrix case. The formulas (3.13) and
(3.14) allow an alternate proof of Proposition 11 which avoids appeal to the Inverse Function
Theorem and gives the explicit estimate that $\exp$ is one-to-one on $\mathcal{B}_{\log 2}(0)$. However, this explicit value of
$r$ is not needed, and we need in any case to appeal to the Inverse Function Theorem below in
Theorem 17, so this more explicit proof of Proposition 11 gives us no particular benefit.
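Formulas (3.13) and (3.14) are easy to test numerically. The sketch below truncates both series and checks $\exp(\log B) = B$ for one sample matrix $B$ near $1_V$; the helper names, the sample matrix, and the truncation lengths are illustrative assumptions of ours.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(X, terms=40):
    """Partial sum of exp(X) = sum_n X^n / n!."""
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

def mat_log(B, terms=200):
    """Partial sum of (3.13): log(B) = -sum_n (1 - B)^n / n, valid when ||1 - B|| < 1."""
    n = len(B)
    N = [[float(i == j) - B[i][j] for j in range(n)] for i in range(n)]   # N = 1 - B
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, N)                                         # N^k
        total = [[a - b / k for a, b in zip(r, s)] for r, s in zip(total, power)]
    return total

B = [[1.2, 0.1], [-0.05, 0.9]]                    # sample matrix close to the identity
recovered = mat_exp(mat_log(B))
err_exp_log = max(abs(a - b) for r, s in zip(recovered, B) for a, b in zip(r, s))
```

The logarithm series converges only slowly as $\|1_V - B\|$ approaches 1, so a fixed truncation such as the one used here is reasonable only for $B$ well inside $\mathcal{B}_1(1_V)$.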

Proposition 12. Choose an $r < \log 2$, and let $T$ be in $\exp \mathcal{B}_r(0)$, say $T = \exp A$. Then the
transformation $S = \exp(A/2)$ is a square root of $T$; that is, $S^2 = T$. Moreover, $S$ is the unique
square root of $T$ contained in $\exp \mathcal{B}_r(0)$.

Proof. That $S^2 = T$ follows directly from Proposition 8. It is only necessary to prove the
uniqueness of $S$. From Proposition 11, we see that our restriction on $r$ implies $\exp \mathcal{B}_r(0) \subseteq \mathcal{B}_1(1_V)$.
Hence it will suffice to show that if $A$, $B$ are distinct linear maps of norm less than 1, then
$(1_V + A)^2 \neq (1_V + B)^2$. Suppose the contrary. Then expanding the squares, cancelling the $1_V$'s
and transposing, we find the equation

$$2(A - B) = B^2 - A^2 = B(B - A) + (B - A)A.$$

Taking norms yields

$$2\|A - B\| \le \|B\|\,\|B - A\| + \|B - A\|\,\|A\| = (\|A\| + \|B\|)\,\|B - A\|.$$

This implies either $\|A - B\| = 0$, which is false since $A \neq B$, or $\|A\| + \|B\| \ge 2$, which is false
since both $\|A\|$ and $\|B\|$ are less than 1. This contradiction establishes the uniqueness of $S$. ■

Proof of Theorem 10. Let $t \to M(t)$ be a continuous one-parameter group in $\mathrm{GL}(V)$. Since
$M(0) = 1_V$, if we specify $r > 0$, we may by continuity and Proposition 11 find an $\epsilon > 0$ such that
$M(t) \in \exp(\mathcal{B}_r(0))$ for $|t| \le \epsilon$. We take $r < \log 2$. Write

$$M(\epsilon) = \exp A_1$$

for appropriate $A_1 \in \mathcal{B}_r(0)$. If we set

$$A = \epsilon^{-1} A_1,$$

then $M(\epsilon) = \exp(\epsilon A)$. The transformations $M(\epsilon/2)$ and $\exp((\epsilon/2)A)$ are then both square roots
of $M(\epsilon)$ lying in $\exp(\mathcal{B}_r(0))$. By Proposition 12 we conclude

$$M(\epsilon/2) = \exp((\epsilon/2)A).$$

An obvious induction using Proposition 12 shows that

$$M(2^{-n}\epsilon) = \exp(2^{-n}\epsilon A)$$

for all positive integers $n$. Taking $m$th powers, we conclude

$$M(m 2^{-n}\epsilon) = \exp(m 2^{-n}\epsilon A)$$

for all integers $m$ and $n$. Since the numbers $m 2^{-n}\epsilon$ are dense in $\mathbb{R}$, Theorem 10 follows by
continuity. ■
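The first step of this proof can be imitated numerically: given a one-parameter group as a black box, take a logarithm of $M(\epsilon)$ for small $\epsilon$ and rescale. In the sketch below the rotation group plays the role of $M$; the value of `eps`, the dyadic sample times, and all helper names are our own illustrative choices.

```python
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(X, terms=40):
    """Partial sum of exp(X) = sum_n X^n / n!."""
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

def mat_log(B, terms=200):
    """Partial sum of (3.13): log(B) = -sum_n (1 - B)^n / n."""
    n = len(B)
    N = [[float(i == j) - B[i][j] for j in range(n)] for i in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, N)
        total = [[a - b / k for a, b in zip(r, s)] for r, s in zip(total, power)]
    return total

def max_diff(X, Y):
    return max(abs(a - b) for r, s in zip(X, Y) for a, b in zip(r, s))

def M(t):
    """A 'black box' continuous one-parameter group: rotation through angle t."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

eps = 0.25                                         # small enough that M(eps) is near 1_V
A1 = mat_log(M(eps))                               # M(eps) = exp(A1)
A = [[a / eps for a in row] for row in A1]         # A = A1 / eps

# Compare M(t) with exp(tA) at some dyadic multiples t = m * 2^-n * eps.
dyadic_times = [m * eps / 2 ** n for n in range(1, 5) for m in (-3, 1, 7)]
err_dyadic = max(max_diff(M(t), mat_exp([[t * a for a in row] for row in A]))
                 for t in dyadic_times)
err_generator = max_diff(A, [[0.0, -1.0], [1.0, 0.0]])
```

The recovered $A$ agrees with the generator of the rotation flow, and $M(t) = \exp(tA)$ holds at the sampled dyadic times up to rounding.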

4. Properties of the Exponential Map 

The map $\exp$ is the basic link between the linear structure on $\mathrm{End}\,V$ and the multiplicative
structure on $\mathrm{GL}(V)$. We will describe some salient properties of this link.

Choose $r$ with $0 < r < 1/2$ such that $\exp$ is one-to-one on $\mathcal{B}_r(0)$. Choose $r_1 < r$ so that if
$A, B \in \mathcal{B}_{r_1}(0)$, then $\exp A \exp B$ is contained in $\exp \mathcal{B}_r(0)$. Then we can write

(4.1)    $\exp A \exp B = \exp C$

for some $C \in \mathcal{B}_r(0)$. The Inverse Function Theorem guarantees that $C$ is a smooth (in fact
analytic) function of $A$ and $B$. There is a beautiful formula, the Campbell-Hausdorff formula [J1],
[Se], which expresses $C$ as a universal power series in $A$ and $B$. To develop this completely would
take too long. We will just give the first two terms in the expression for $C$. These suffice for most
purposes.

For $A, B \in \mathrm{End}\,V$, write

(4.2)    $[A, B] = AB - BA$.

The quantity $[A, B]$ is called the commutator of $A$ and $B$, and will be seen later to provide the Lie
bracket operation in the Lie algebras we construct.

Proposition 13. Suppose $A$, $B$, $C$ have norm at most $1/2$ and satisfy equation (4.1). Then we
have

(4.3)    $C = A + B + \tfrac{1}{2}[A, B] + S$,

where the remainder term $S$ satisfies

(4.4)    $\|S\| \le 65(\|A\| + \|B\|)^3$.

Proof. We have

(4.5)    $\exp C = 1_V + C + R_1(C)$,

where the remainder $R_1(C)$ is

$$R_1(C) = \sum_{n=2}^{\infty} \frac{C^n}{n!}$$

and satisfies the obvious estimate

$$\|R_1(C)\| \le \|C\|^2 \Bigl(\sum_{n=2}^{\infty} \frac{1}{n!}\Bigr) \le \|C\|^2$$

when $\|C\| \le 1$, hence certainly when $\|C\| \le 1/2$.

Similarly we have

(4.6)    $\exp A \exp B = 1_V + A + B + R_1(A, B)$,

where by rearrangement of the double sum

$$R_1(A, B) = \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} A^k B^{n-k}.$$

Hence we have the estimate

$$\|R_1(A, B)\| \le (\|A\| + \|B\|)^2 \Bigl(\sum_{n=2}^{\infty} \frac{(\|A\| + \|B\|)^{n-2}}{n!}\Bigr) \le (\|A\| + \|B\|)^2$$

when $\|A\| + \|B\| \le 1$.

Comparing equations (4.5) and (4.6), we see that equation (4.1) implies

(4.7)    $C = A + B + R_1(A, B) - R_1(C)$.

Hence

$$\|C\| \le \|A\| + \|B\| + (\|A\| + \|B\|)^2 + \|C\|^2 \le 2(\|A\| + \|B\|) + \tfrac{1}{2}\|C\|$$

when $A$, $B$ and $C$ all have norm at most $1/2$.
Thus

(4.8)    $\|C\| \le 4(\|A\| + \|B\|)$.

Returning to equation (4.7), we further find

(4.9)    $\|C - (A + B)\| \le \|R_1(A, B)\| + \|R_1(C)\| \le (\|A\| + \|B\|)^2 + (4(\|A\| + \|B\|))^2 = 17(\|A\| + \|B\|)^2$.

We now refine these estimates to second order. In analogy with (4.5) we have

(4.10)    $\exp C = 1_V + C + \dfrac{C^2}{2} + R_2(C)$,

where

$$R_2(C) = \sum_{n=3}^{\infty} \frac{C^n}{n!}$$

is easily estimated by

(4.11)    $\|R_2(C)\| \le \tfrac{1}{3}\|C\|^3$

when $\|C\| \le 1$.

If we substitute expression (4.3) for $C$ in equation (4.10), we obtain

(4.12)    $\exp C = 1_V + A + B + \tfrac{1}{2}[A, B] + S + \tfrac{1}{2}C^2 + R_2(C)$
              $= 1_V + A + B + \tfrac{1}{2}[A, B] + \tfrac{1}{2}(A + B)^2 + T$
              $= 1_V + A + B + \tfrac{1}{2}(A^2 + 2AB + B^2) + T$,

where

$$T = S + \tfrac{1}{2}\bigl(C^2 - (A + B)^2\bigr) + R_2(C).$$

On the other hand, we have

(4.13)    $\exp A \exp B = 1_V + A + B + \tfrac{1}{2}(A^2 + 2AB + B^2) + R_2(A, B)$,

where

$$R_2(A, B) = \sum_{n=3}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} A^k B^{n-k}$$

satisfies $\|R_2(A, B)\| \le \tfrac{1}{3}(\|A\| + \|B\|)^3$ when $\|A\| + \|B\| \le 1$.
Comparison of (4.12) and (4.13) in the light of (4.1) yields

$$S = R_2(A, B) + \tfrac{1}{2}\bigl((A + B)^2 - C^2\bigr) - R_2(C).$$

Taking norms, we find

$$\|S\| \le \|R_2(A, B)\| + \tfrac{1}{2}\|(A + B)(A + B - C) + (A + B - C)C\| + \|R_2(C)\|$$
$$\le \tfrac{1}{3}(\|A\| + \|B\|)^3 + \tfrac{1}{2}(\|A\| + \|B\| + \|C\|)\,\|A + B - C\| + \tfrac{1}{3}\|C\|^3$$
$$\le \tfrac{1}{3}(\|A\| + \|B\|)^3 + \tfrac{5}{2}(\|A\| + \|B\|) \cdot 17(\|A\| + \|B\|)^2 + \tfrac{1}{3}\bigl(4(\|A\| + \|B\|)\bigr)^3$$
$$\le 65(\|A\| + \|B\|)^3,$$

as was to be shown.
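The estimates (4.4) and (4.9) can be sanity-checked on a small example. The sketch below computes $C = \log(\exp A \exp B)$ from truncated series and measures the remainders. The two sample matrices and the helper names are our own, and the Frobenius norm is used as an assumed stand-in for the norm of formula (3.3), which is not reproduced in this section.

```python
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y, c=1.0):
    """Entrywise X + c*Y."""
    return [[a + c * b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def bracket(X, Y):
    """Commutator [X, Y] = XY - YX, as in (4.2)."""
    return add(mat_mul(X, Y), mat_mul(Y, X), -1.0)

def mat_exp(X, terms=40):
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = add(total, term)
    return total

def mat_log(B, terms=200):
    """Partial sum of (3.13): log(B) = -sum_n (1 - B)^n / n."""
    n = len(B)
    N = [[float(i == j) - B[i][j] for j in range(n)] for i in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, N)
        total = add(total, power, -1.0 / k)
    return total

def norm(X):
    """Frobenius norm (an assumption; the text's norm is defined in (3.3))."""
    return math.sqrt(sum(v * v for row in X for v in row))

A = [[0.0, 0.1], [0.05, -0.02]]                    # sample matrices of small norm
B = [[0.03, -0.04], [0.08, 0.01]]
C = mat_log(mat_mul(mat_exp(A), mat_exp(B)))       # exp A exp B = exp C

first_order_gap = norm(add(C, add(A, B), -1.0))    # ||C - (A + B)||, compare (4.9)
S = add(add(C, add(A, B), -1.0), bracket(A, B), -0.5)   # remainder in (4.3)
size = norm(A) + norm(B)
```

For this pair the remainder `norm(S)` is far below the bound $65(\|A\| + \|B\|)^3$; the constant 65 is chosen for convenience in the proof, not for sharpness.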

We will derive two main consequences of Proposition 13. These relate group operations in
$\mathrm{GL}(V)$ to the linear operations in $\mathrm{End}(V)$, and are crucial ingredients in the proof of the main
theorem (Theorem 17 in §5) that relates Lie algebras to Lie groups. Proposition 14 relates group
multiplication in $\mathrm{GL}(V)$ to addition in $\mathrm{End}(V)$, and Proposition 15 relates the group commutator
operation to the bilinear commutator bracket defined in equation (4.2).

Proposition 14 (Trotter Product Formula). For $A, B \in \mathrm{End}\,V$, one has

(4.14)    $\exp(A + B) = \lim_{n \to \infty} \bigl(\exp(A/n)\exp(B/n)\bigr)^n$.

Proof. For $n$ large enough, $A/n$ and $B/n$ will be close enough to the origin that formula (4.3)
applies. We then have

$$\exp(A/n)\exp(B/n) = \exp C_n,$$

where by estimate (4.9)

$$\|C_n - (A + B)/n\| \le 17\bigl((\|A\| + \|B\|)/n\bigr)^2.$$

Hence as $n \to \infty$, we see that $nC_n \to A + B$. Since $\exp nC_n = (\exp C_n)^n$, equation (4.14) follows. ■
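A short experiment shows the convergence in (4.14) for a pair of non-commuting matrices. The matrices, the values of $n$, and the helper names below are illustrative choices of ours; the proof above suggests an error of order $1/n$.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(X, terms=40):
    """Partial sum of exp(X) = sum_n X^n / n!."""
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

def mat_pow(X, n):
    """X^n for a nonnegative integer n, by repeated squaring."""
    result = [[float(i == j) for j in range(len(X))] for i in range(len(X))]
    while n > 0:
        if n % 2 == 1:
            result = mat_mul(result, X)
        X = mat_mul(X, X)
        n //= 2
    return result

def max_diff(X, Y):
    return max(abs(a - b) for r, s in zip(X, Y) for a, b in zip(r, s))

A = [[0.0, 1.0], [0.0, 0.0]]                       # A and B do not commute
B = [[0.0, 0.0], [1.0, 0.0]]
target = mat_exp([[a + b for a, b in zip(r, s)] for r, s in zip(A, B)])   # exp(A + B)

def trotter(n):
    """(exp(A/n) exp(B/n))^n, the right side of (4.14) before the limit."""
    step = mat_mul(mat_exp([[a / n for a in row] for row in A]),
                   mat_exp([[b / n for b in row] for row in B]))
    return mat_pow(step, n)

errors = [max_diff(trotter(n), target) for n in (10, 100, 1000)]
```

In this example the error shrinks by roughly a factor of 10 each time $n$ grows by a factor of 10.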

Recall that the (linear) commutator $[A, B]$ is defined in equation (4.2). Recall also that if $g$, $h$
are elements of a group, then the group commutator of $g$ and $h$, written $(g : h)$, is the expression

$$(g : h) = g h g^{-1} h^{-1}.$$

Proposition 15 (Commutator formula). For $A, B \in \mathrm{End}\,V$, one has

(4.15)    $\exp[A, B] = \lim_{n \to \infty} \bigl(\exp(A/n)\exp(B/n)\exp(-A/n)\exp(-B/n)\bigr)^{n^2}$
                    $= \lim_{n \to \infty} \bigl(\exp(A/n) : \exp(B/n)\bigr)^{n^2}$.


Proof. As in Proposition 14, for large $n$ we have

$$\exp(A/n)\exp(B/n) = \exp C_n = \exp\Bigl(\frac{A + B}{n} + \frac{[A, B]}{2n^2} + S_n\Bigr),$$

where

$$\|S_n\| \le 65\,\frac{(\|A\| + \|B\|)^3}{n^3}.$$

Similarly

$$\exp(-A/n)\exp(-B/n) = \exp\Bigl(-\frac{A + B}{n} + \frac{[A, B]}{2n^2} + S'_n\Bigr) = \exp C'_n$$

with also

$$\|S'_n\| \le 65\,\frac{(\|A\| + \|B\|)^3}{n^3}.$$

Hence

$$(\exp(A/n) : \exp(B/n)) = \exp C_n \exp C'_n = \exp E_n,$$

where

$$E_n = C_n + C'_n + \tfrac{1}{2}[C_n, C'_n] + T_n = \frac{[A, B]}{n^2} + \tfrac{1}{2}[C_n, C'_n] + S_n + S'_n + T_n,$$

where $T_n$ is the term $S$ in equation (4.3) if $A = C_n$ and $B = C'_n$.

It will suffice to show that there is a number $\gamma$, depending on $A$ and $B$, such that

$$\Bigl\|E_n - \frac{[A, B]}{n^2}\Bigr\| \le \frac{\gamma}{n^3}.$$

For then

$$(\exp E_n)^{n^2} = \exp([A, B] + U_n)$$

with $\|U_n\| \le \gamma/n$, and equation (4.15) follows. In turn, it will suffice to show that the 2nd, 3rd, 4th
and 5th terms in the expression for $E_n$ are each less than a constant times $n^{-3}$. For $S_n$, $S'_n$ and $T_n$,
this follows from Proposition 13. Thus we need only worry about $[C_n, C'_n]$. We compute

$$[C_n, C'_n] = \Bigl[\frac{A + B}{n} + \frac{[A, B]}{2n^2} + S_n,\ -\frac{A + B}{n} + \frac{[A, B]}{2n^2} + S'_n\Bigr]$$
$$= \frac{1}{n^3}[A + B, [A, B]] + \frac{1}{n}[A + B, S_n + S'_n] + \frac{1}{2n^2}[[A, B], S'_n - S_n] + [S_n, S'_n].$$

Using Proposition 13, we see that each of the four terms in this last sum is bounded by a constant
times $n^{-3}$. (In fact, all terms except the first are bounded by a constant times $n^{-4}$.)
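The commutator formula (4.15) can be watched converging in the same way as the Trotter formula. In this sketch, which uses our own sample matrices, sample values of $n$, and helper names, the group commutator of $\exp(A/n)$ and $\exp(B/n)$ is raised to the power $n^2$ and compared with $\exp[A, B]$.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(X, terms=40):
    """Partial sum of exp(X) = sum_n X^n / n!."""
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

def mat_pow(X, n):
    """X^n for a nonnegative integer n, by repeated squaring."""
    result = [[float(i == j) for j in range(len(X))] for i in range(len(X))]
    while n > 0:
        if n % 2 == 1:
            result = mat_mul(result, X)
        X = mat_mul(X, X)
        n //= 2
    return result

def scaled(c, X):
    return [[c * v for v in row] for row in X]

def max_diff(X, Y):
    return max(abs(a - b) for r, s in zip(X, Y) for a, b in zip(r, s))

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
AB, BA = mat_mul(A, B), mat_mul(B, A)
commutator = [[a - b for a, b in zip(r, s)] for r, s in zip(AB, BA)]   # [A, B]
target = mat_exp(commutator)                                           # exp[A, B]

def commutator_power(n):
    """(exp(A/n) exp(B/n) exp(-A/n) exp(-B/n))^(n^2)."""
    g, h = mat_exp(scaled(1.0 / n, A)), mat_exp(scaled(1.0 / n, B))
    g_inv, h_inv = mat_exp(scaled(-1.0 / n, A)), mat_exp(scaled(-1.0 / n, B))
    return mat_pow(mat_mul(mat_mul(g, h), mat_mul(g_inv, h_inv)), n * n)

errors = [max_diff(commutator_power(n), target) for n in (10, 40, 160)]
```

Consistent with the bound $\|U_n\| \le \gamma/n$ in the proof, the error here falls roughly in proportion to $1/n$.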

There is one further concept involving the exponential map that is basic to Lie theory. It
involves conjugation, which is generally referred to as the “adjoint action.” For $g \in \mathrm{GL}(V)$ and
$A \in \mathrm{End}\,V$, we can form the conjugate

(4.16)    $\mathrm{Ad}\,g(A) = gAg^{-1}$.

The following proposition is easily verified and left as an exercise.

Proposition 16. (i) $\mathrm{Ad}\,g(aA + bB) = a\,\mathrm{Ad}\,g(A) + b\,\mathrm{Ad}\,g(B)$ for $A, B \in \mathrm{End}\,V$; $a, b \in \mathbb{R}$;
and $g \in \mathrm{GL}(V)$.

(ii) $\mathrm{Ad}\,g(AB) = \mathrm{Ad}\,g(A)\,\mathrm{Ad}\,g(B)$.

(iii) $\mathrm{Ad}\,g_1 g_2(A) = \mathrm{Ad}\,g_1(\mathrm{Ad}\,g_2(A))$.

Formulas (i) and (ii) say $\mathrm{Ad}\,g$ is an algebra automorphism of $\mathrm{End}\,V$, and Formula (iii) says the
map $\mathrm{Ad} : g \to \mathrm{Ad}\,g$ is a group homomorphism from $\mathrm{GL}(V)$ to the automorphism group of
$\mathrm{End}(V)$. The map $\mathrm{Ad}$ is called the adjoint action of $\mathrm{GL}(V)$ on $\mathrm{End}(V)$.

Formula (iii) implies in particular that if $\exp tA$ is a one-parameter subgroup of $\mathrm{GL}(V)$, then
$\mathrm{Ad}\,\exp tA$ is a one-parameter group of linear transformations on $\mathrm{End}\,V$. Hence $\mathrm{Ad}\,\exp tA$ has
infinitesimal generator $\mathcal{A} \in \mathrm{End}(\mathrm{End}\,V)$. We can compute $\mathcal{A}$ by the formula

$$\mathcal{A}(B) = \lim_{t \to 0} \frac{\exp(tA)\,B\,\exp(-tA) - B}{t}$$
$$= \frac{d}{dt}\bigl(\exp(tA)\,B\,\exp(-tA)\bigr)\Big|_{t=0}$$
$$= \bigl(A(\exp tA)B(\exp -tA) + (\exp tA)B(-A)(\exp -tA)\bigr)\Big|_{t=0}$$
$$= AB - BA = [A, B].$$


Here we have used the fact that

$$\frac{d}{dt}(\exp(tA)) = A\exp(tA).$$

This formula may be verified by direct calculation from the definition of $\exp(tA)$. Hence if we
define

$$\mathrm{ad}\,A : \mathrm{End}\,V \to \mathrm{End}\,V$$

by

$$\mathrm{ad}\,A(B) = [A, B],$$

we have the following formula.

Proposition 17. For $A \in \mathrm{End}\,V$

(4.17)    $\mathrm{Ad}(\exp A) = \exp(\mathrm{ad}\,A)$.
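Formula (4.17) says that conjugating by $\exp A$ is the same as applying the exponential series in the operator $\mathrm{ad}\,A$. The sketch below checks this on one pair of sample $2 \times 2$ matrices; the matrices, the truncation length, and the helper names are our own choices.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y, c=1.0):
    """Entrywise X + c*Y."""
    return [[a + c * b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def bracket(X, Y):
    """ad X (Y) = [X, Y] = XY - YX."""
    return add(mat_mul(X, Y), mat_mul(Y, X), -1.0)

def mat_exp(X, terms=40):
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = add(total, term)
    return total

A = [[0.3, -0.5], [0.2, 0.1]]
B = [[1.0, 2.0], [3.0, 4.0]]

# Left side of (4.17) applied to B: Ad(exp A)(B) = exp(A) B exp(-A).
neg_A = [[-a for a in row] for row in A]
lhs = mat_mul(mat_mul(mat_exp(A), B), mat_exp(neg_A))

# Right side applied to B: exp(ad A)(B) = sum_k (ad A)^k(B) / k!.
rhs = [row[:] for row in B]
term = [row[:] for row in B]                       # holds (ad A)^k(B) / k!
for k in range(1, 40):
    term = [[v / k for v in row] for row in bracket(A, term)]
    rhs = add(rhs, term)

err_ad = max(abs(a - b) for r, s in zip(lhs, rhs) for a, b in zip(r, s))
```

Here $\mathrm{ad}\,A$ is never written out as a $4 \times 4$ matrix; it is applied repeatedly to $B$ through the bracket, which is all the series on the right side requires.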


5. The Lie Algebra of a Matrix Group 

By a matrix group we mean a closed subgroup of $\mathrm{GL}(V)$ for some vector space $V$. This section
shows a matrix group is a Lie group. What that means is expressed in Theorem 17. Most, though
not all, Lie groups can be realized as matrix groups. This article discusses only matrix groups.

Examples. (i) $\mathrm{GL}(V)$ itself.

(ii) $\mathrm{SL}_n(\mathbb{R})$, the special linear group, of $n \times n$ matrices of determinant 1.

(iii) $\mathrm{O}_{p,q}$, the “pseudo-orthogonal groups,” consisting of all matrices in $\mathrm{GL}_{p+q}(\mathbb{R})$ that
preserve the indefinite inner product

$$(x, x')_{p,q} = \sum_{i=1}^{p} x_i x'_i - \sum_{i=p+1}^{p+q} x_i x'_i, \qquad x, x' \in \mathbb{R}^{p+q}.$$

(iv) $\mathrm{Sp}_{2n}(\mathbb{R})$, the real symplectic group, consisting of all matrices in $\mathrm{SL}_{2n}(\mathbb{R})$ that preserve
the skew-symmetric bilinear form

$$\langle x, x' \rangle = \sum_{i=1}^{n} \bigl(x_i x'_{i+n} - x'_i x_{i+n}\bigr), \qquad x, x' \in \mathbb{R}^{2n}.$$

(v) The group $P(U)$ of transformations that preserve a subspace $U$ of $V$. For instance, if
$V = \mathbb{R}^n$, and $U = U_m = \{(x_1, x_2, \dots, x_m, 0, 0, \dots, 0)\}$, where $m \le n$, then

$$P(U_m) = \left\{ \begin{bmatrix} A & X \\ 0 & B \end{bmatrix} : A \in \mathrm{GL}_m(\mathbb{R}),\ B \in \mathrm{GL}_{n-m}(\mathbb{R}),\ X \in M_{m,n-m}(\mathbb{R}) \right\}.$$

Here $M_{m,n-m}(\mathbb{R})$ is the space of $m \times (n - m)$ real matrices.

(vi) Any intersection of matrix groups is a matrix group. For instance, the intersection
$\bigcap_{m=1}^{n} P(U_m)$ of the groups $P(U_m)$ of example (v) is the group of invertible upper
triangular matrices.

(vii) The group preserving some closed subgroup, not necessarily a subspace, of $V$. For
example, let $\mathbb{Z}^n \subseteq \mathbb{R}^n$ be the discrete subgroup of vectors with integral entries. Set

$$\mathrm{GL}_n(\mathbb{Z}) = \{A \in \mathrm{GL}_n(\mathbb{R}) : A(\mathbb{Z}^n) = \mathbb{Z}^n\}.$$

Then $\mathrm{GL}_n(\mathbb{Z})$ can also be shown to consist of matrices with integer entries and
determinant $\pm 1$.

(viii) The group commuting with some family $\{T_i\}$ of operators on $V$ is a matrix group. For
example, we can identify $\mathbb{C}^n$ with $\mathbb{R}^{2n}$ by letting $x_{2j-1}$ and $x_{2j}$ be the real and imaginary
parts of the coordinate $z_j$ of $z = (z_1, z_2, \dots, z_n) \in \mathbb{C}^n$. If we do so, the operation of
multiplication by a complex scalar becomes some (real) linear operator on $\mathbb{R}^{2n}$. Further,
the group $\mathrm{GL}_n(\mathbb{C})$ becomes identified with the subgroup of $\mathrm{GL}_{2n}(\mathbb{R})$ formed by
elements which commute with the multiplications by complex scalars.

(ix) If $G_i$ is a matrix group in $\mathrm{GL}(V_i)$, $i = 1, 2$, then $G_1 \times G_2$ is a matrix group in
$\mathrm{GL}(V_1 \oplus V_2)$ in the obvious way.

(x) If $G$ is a matrix group, then $G^0$, the connected component of the identity in $G$, is a
matrix group.

(xi) The normalizer in $\mathrm{GL}(V)$ of a matrix group is a matrix group.

The main result of this section is the essential phenomenon behind Lie theory: a matrix group 
has naturally attached to it a Lie algebra. Before showing this we recall what a Lie algebra is. 

Definition. A real Lie algebra $\mathfrak{g}$ is a real vector space equipped with a product

(5.1)    $[\,,\,] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$

satisfying the identities

(5.2)    (i) (Bilinearity). For $a, b \in \mathbb{R}$ and $x, y, z \in \mathfrak{g}$,

$$[ax + by, z] = a[x, z] + b[y, z],$$
$$[z, ax + by] = a[z, x] + b[z, y].$$

(ii) (Skew symmetry). For $x, y \in \mathfrak{g}$,

$$[x, y] = -[y, x].$$

(iii) (Jacobi Identity). For $x, y, z \in \mathfrak{g}$,

$$[x, [y, z]] + [z, [x, y]] + [y, [z, x]] = 0.$$

The first main example of a Lie algebra is $\mathrm{End}\,V$ equipped with the bracket operation $[\,,\,]$ of
commutator, as given in equation (4.2). It is left as an exercise to verify that this satisfies the
correct identities. Any subspace of $\mathrm{End}\,V$ which is closed under $[\,,\,]$ will become a Lie algebra in its
own right. Since our main theorem will provide us with such a subspace for each matrix group, we
will postpone a more explicit discussion of examples.

Consider a matrix group $G \subseteq \mathrm{GL}(V)$. Let $\exp^{-1}(G) \subseteq \mathrm{End}\,V$ be the inverse image of $G$ under
$\exp$. Since $\exp(nA) = (\exp A)^n$, it is clear that $\exp^{-1}(G)$ is closed under scalar multiplication by
integers. Set

$$\mathfrak{g} = \{A \in \mathrm{End}\,V : \exp tA \in G \text{ for all } t \in \mathbb{R}\} = \bigcap_{t \in \mathbb{R}^{\times}} t\,\exp^{-1}(G).$$

Observe that $\mathfrak{g}$ is the collection of infinitesimal generators of one-parameter subgroups of $G$. We
call $\mathfrak{g}$ the Lie algebra of $G$.

Theorem 17. (a) The Lie algebra $\mathfrak{g}$ of a matrix group $G$ is a Lie algebra.

(b) The map $\exp : \mathfrak{g} \to G$ maps a neighborhood of $0$ in $\mathfrak{g}$ bijectively onto a neighborhood of $1_V$
in $G$.

Remarks. (i) Part (b) of Theorem 17 implies $G$ is locally homeomorphic to Euclidean space. In
fact it is not hard to refine part (b) and show that $G$ has the structure of a smooth manifold, such
that the group multiplication is smooth, but we will not do that here.

(ii) Theorem 17 provides a geometric picture of the relation between $\mathfrak{g}$ and $G$. If a one-parameter
group $\exp(tA)$ is regarded as a curve inside the vector space $\mathrm{End}\,V$, then this curve passes
through the identity $1_V$ at time $t = 0$. By differentiating the formula for $\exp tA$, we see the tangent
vector at the point $1_V$ to this curve is just $A$. Thus, as we have defined it, $\mathfrak{g}$ consists simply of all
tangent vectors to the curves defined by one-parameter groups in $G$. But Theorem 17 asserts that
these tangent vectors actually fill out some linear subspace (namely $\mathfrak{g}$) of $\mathrm{End}\,V$, and further, if we
make the smooth change of coordinates $A \to \exp A$, then this linear subspace $\mathfrak{g}$ is bent in such a
way that it lies entirely in $G$, and fills up $G$ around $1_V$. In other words, $G$ is shown to be a smooth
multidimensional surface inside $\mathrm{End}\,V$, and $\mathfrak{g}$ is simply its tangent space at the point $1_V$.

The main burden of the proof of Theorem 17 is carried by the following technical result. 

Lemma 18. Suppose $\{A_n\}$ is a sequence in $\exp^{-1}(G)$, and $\|A_n\| \to 0$. Let $s_n$ be a sequence of real
numbers. Then any cluster point of $s_n A_n$ is in $\mathfrak{g}$.

Proof. Let $B$ be the cluster point. By passing to a subsequence if necessary we may assume that
$s_n A_n$ converges to $B$. Fix a number $t \in \mathbb{R}$. Let $m_n$ be an integer such that $|m_n - t s_n| \le 1$. Then
$m_n A_n$ converges to $tB$; for we have

$$\|m_n A_n - tB\| = \|(m_n - t s_n)A_n + t(s_n A_n - B)\|$$
$$\le |m_n - t s_n|\,\|A_n\| + |t|\,\|s_n A_n - B\|$$
$$\le \|A_n\| + |t|\,\|s_n A_n - B\|,$$

which converges to zero as $n \to \infty$, by our assumptions on $A_n$ and $B$. Since $m_n A_n \in \exp^{-1}(G)$,
and $\exp^{-1}(G)$ is closed, we see that $tB \in \exp^{-1}(G)$. Since $t$ was arbitrary in $\mathbb{R}$, we see that
$B \in \mathfrak{g}$. ■

Proof of Theorem 17. We first show $\mathfrak{g}$ is a subspace of $\mathrm{End}\,V$. Since $\mathfrak{g}$ is by definition closed
under scalar multiplication, we need only show it is closed under addition. Take $A, B \in \mathfrak{g}$. Then
as in Proposition 14 we know that for large enough $n$

$$\exp(A/n)\exp(B/n) = \exp C_n,$$

where $\|C_n\| \to 0$, and $nC_n \to A + B$. Hence Lemma 18 implies $A + B \in \mathfrak{g}$.

Next we show that if $A, B \in \mathfrak{g}$, then also $[A, B] \in \mathfrak{g}$. As in Proposition 15 we know that for
large $n$ we have

$$(\exp(A/n) : \exp(B/n)) = \exp E_n$$

with $E_n \to 0$ and $n^2 E_n \to [A, B]$. Another application of Lemma 18 says $[A, B] \in \mathfrak{g}$. This
concludes part (a) of Theorem 17.

We know $\mathfrak{g}$ is a linear subspace of $\mathrm{End}(V)$. Let $Y \subseteq \mathrm{End}(V)$ be a complementary subspace of
$\mathfrak{g}$, so that $\mathrm{End}\,V = \mathfrak{g} \oplus Y$. Let $p_1$ and $p_2$ be the projections of $\mathrm{End}\,V$ on $\mathfrak{g}$ and $Y$, respectively,
with respective kernels $Y$ and $\mathfrak{g}$. Define a map $E : \mathrm{End}\,V \to \mathrm{GL}(V)$ by

$$E(A) = \exp(p_1(A))\exp(p_2(A)).$$

By use of Proposition 13, we can compute that

$$\frac{d}{dt}\bigl(\exp(p_1(tA))\exp(p_2(tA))\bigr)\Big|_{t=0} = p_1(A) + p_2(A) = A.$$

This says that the differential of $E$ at $0$ is the identity map on $\mathrm{End}\,V$, so that $E$ takes small
neighborhoods of $0$ to neighborhoods of $1_V$ bijectively, by the Inverse Function Theorem. Choose
a small ball $\mathcal{B}_r(0) \subseteq \mathrm{End}\,V$, and suppose $\exp(\mathcal{B}_r(0) \cap \mathfrak{g})$ does not cover a neighborhood of $1_V$ in
$G$. Then we can find a sequence $B_n \in \exp^{-1}(G)$ such that $B_n \to 0$, but $B_n \notin \mathfrak{g}$. When $B_n$ is close
enough to $0$, we may write

$$\exp B_n = E(A_n)$$

for some $A_n$. We will have $A_n \to 0$ as $B_n \to 0$. Then

$$\exp(p_2(A_n)) = \exp(p_1(A_n))^{-1}\exp B_n$$

is also in $G$, and $p_2(A_n)$ is nonzero by our assumption on $B_n$. Since $A_n \to 0$, $p_2(A_n) \to 0$ also. The
sequence $\|p_2(A_n)\|^{-1} p_2(A_n)$ will have cluster points, and these must be in $\mathfrak{g}$ by Lemma 18. On the
other hand, $p_2(A_n) \in Y$, so all cluster points must be in $Y$. This contradicts the fact that $Y$ was
chosen complementary to $\mathfrak{g}$, so statement (b) of Theorem 17 follows. ■

Examples. We will describe below the Lie algebras of some of the groups listed at the 
beginning of this section. The verification that the indicated Lie algebras are indeed the Lie 
algebras of the stated groups is left as an exercise. 

(i) The Lie algebra of $\mathrm{GL}(V)$ is of course $\mathrm{End}(V)$.

(ii) The Lie algebra of $\mathrm{SL}_n(\mathbb{R})$ is the space $\mathfrak{sl}_n(\mathbb{R})$ of $n \times n$ matrices of trace zero.

(iii) Let $\beta$ be a bilinear form on $V$. The isometry group of $\beta$ is the group of invertible operators
$A$ such that

$$\beta(Au, Av) = \beta(u, v) \quad \text{for all } u, v \in V.$$

The Lie algebra of this group is the space of operators $B$ such that

$$\beta(Bu, v) + \beta(u, Bv) = 0.$$

In particular the Lie algebra $\mathfrak{o}_n(\mathbb{R})$ of the orthogonal group $\mathrm{O}_n(\mathbb{R})$ of isometries of the
standard inner product on $\mathbb{R}^n$ is the space of skew-symmetric matrices.

(iv) The Lie algebra of the subgroup of $\mathrm{GL}(V)$ of maps commuting with given operators $\{T_i\}$
is the subalgebra of $\mathrm{End}\,V$ commuting with the $T_i$.

(v) The Lie algebra of the group $P(\{V_i\})$ of invertible transformations which preserve each
of the subspaces $V_i$ of $V$ is the subalgebra of all transformations which preserve the $V_i$. In
particular, the Lie algebra of the group of invertible upper triangular matrices is the
vector space of all upper triangular matrices.

(vi) The Lie algebra of $G_1 \cap G_2$, for matrix groups $G_i$, is $\mathfrak{g}_1 \cap \mathfrak{g}_2$.

(vii) A matrix group $G$ and its identity component $G^0$ have the same Lie algebra.
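One direction of examples (ii) and (iii) can be observed numerically: exponentiating a trace-zero matrix should give determinant 1, and exponentiating a skew-symmetric matrix should give an orthogonal matrix. The sample matrices and helper names below are our own, and the check is an illustration rather than the verification left as an exercise.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(X, terms=40):
    """Partial sum of exp(X) = sum_n X^n / n!."""
    n = len(X)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total

def transpose(X):
    return [list(col) for col in zip(*X)]

# Example (ii): a trace-zero 2 x 2 matrix exponentiates into SL_2(R).
X = [[0.4, 1.0], [-0.3, -0.4]]
g = mat_exp(X)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]

# Example (iii): a skew-symmetric 3 x 3 matrix exponentiates into O_3(R).
K = [[0.0, 0.5, -0.2], [-0.5, 0.0, 0.7], [0.2, -0.7, 0.0]]
R = mat_exp(K)
RtR = mat_mul(transpose(R), R)
err_orthogonal = max(abs(RtR[i][j] - float(i == j))
                     for i in range(3) for j in range(3))
```

The converse inclusions, that every generator of a one-parameter subgroup of these groups has the stated form, follow by differentiating the defining conditions at $t = 0$.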

After its existence, the second most important feature of g is that it is natural (in the sense of 
category theory). This is the content of our next theorem. 

Let $\mathfrak{g}$, $\mathfrak{h}$ be real Lie algebras. A homomorphism from $\mathfrak{g}$ to $\mathfrak{h}$ is a linear map

$$L : \mathfrak{g} \to \mathfrak{h}$$

satisfying

(5.3)    $L([x, y]) = [Lx, Ly], \qquad x, y \in \mathfrak{g}$.

Let $V$, $U$ be real vector spaces.

Theorem 19. Let $G \subseteq \mathrm{GL}(V)$ be a matrix group with Lie algebra $\mathfrak{g}$. Let $\phi : G \to \mathrm{GL}(U)$ be a
continuous homomorphism. Then there is a homomorphism of Lie algebras

(5.4)    $d\phi : \mathfrak{g} \to \mathrm{End}\,U$

such that

(5.5)    $\exp(d\phi(A)) = \phi(\exp A)$.

Proof. If $A \in \mathfrak{g}$, then $\exp tA$ is a one-parameter subgroup of $G$, so $\phi(\exp(tA))$ is a one-parameter
subgroup of $\phi(G) \subseteq \mathrm{GL}(U)$. Hence by Theorem 10 we may write $\phi(\exp(tA)) = \exp(tB)$ for
some $B \in \mathrm{End}\,U$. If we define

$$d\phi(A) = B,$$

then equation (5.5) will obviously be satisfied. To prove this theorem, it suffices to show that $d\phi$ is
a homomorphism of Lie algebras. But this follows directly from Propositions 14 and 15 which
show that the Lie algebra operations in $\mathfrak{g}$ are determined by operations in $G$.

Example. The formula (4.17) shows that $\mathfrak{g}$ is an invariant subspace of $\mathrm{End}\,V$ under the
operators $\mathrm{Ad}\,g$, $g \in G$. The restriction of $\mathrm{Ad}\,g$ to $\mathfrak{g}$ is again denoted by $\mathrm{Ad}\,g$, and the resulting
action of $G$ on $\mathfrak{g}$ is still called the adjoint action. In terms of Theorem 19, the formula (4.17) has
the interpretation

(5.6)    $d(\mathrm{Ad}) = \mathrm{ad}$.


An immediate consequence of Theorem 19 is: 

Corollary 20. If $G_1 \subseteq \mathrm{GL}(V)$ and $G_2 \subseteq \mathrm{GL}(U)$ are isomorphic matrix groups, then their Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are isomorphic as Lie algebras.

Proof. Let $\phi : G_1 \to G_2$ be a continuous isomorphism with continuous inverse $\phi^{-1}$. Then in
particular $\phi$ is a continuous homomorphism from $G_1$ to $\mathrm{GL}(U)$, and $\phi^{-1}$ is a continuous
homomorphism from $G_2$ to $\mathrm{GL}(V)$. Theorem 19 therefore provides us with associated Lie algebra
homomorphisms $d\phi$ and $d(\phi^{-1})$. It follows from the definition of the Lie algebra of a matrix
group and formula (5.5) that in fact $d\phi(\mathfrak{g}_1) \subseteq \mathfrak{g}_2$, and similarly $d(\phi^{-1})(\mathfrak{g}_2) \subseteq \mathfrak{g}_1$. It further
follows from formula (5.5) that since $\phi^{-1} \circ \phi$ is the identity on $G_1$, then also $d(\phi^{-1}) \circ d\phi$ is the
identity on $\mathfrak{g}_1$. In other words $d(\phi^{-1}) = (d\phi)^{-1}$, so $d\phi$ is in fact a Lie algebra isomorphism from
$\mathfrak{g}_1$ to $\mathfrak{g}_2$. ■

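The converse of Corollary 20, that groups with isomorphic Lie algebras are isomorphic, is false.
For example the rotation group

$$\mathrm{SO}_2 = \left\{ \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \right\}$$

and the diagonal group

$$D_1 = \left\{ \begin{bmatrix} a & 0 \\ 0 & 1 \end{bmatrix} : a > 0 \right\}$$

both have Lie algebra isomorphic to $\mathbb{R}$, but $\mathrm{SO}_2$ is homeomorphic to a circle, while $D_1$ is
homeomorphic to $\mathbb{R}$, so they are certainly not isomorphic.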

However, the converse of Corollary 20 is in a sense almost true, so that the bracket operation 
on g almost determines G as a group. After the existence of the Lie algebra, this fact is the most 
remarkable in Lie theory. Its precise formulation is known as Lie’s Third Theorem. It is in proving 
a suitable version of Lie’s Third Theorem that Lie theory begins to get involved, so we will leave 
the story here. Precise treatments of these issues can be found in [A], [Ch], [He], [Se]. 

6. Loose Ends and Further Developments 

In §§3, 4, and 5 we have shown that to each matrix group $G$, there is associated in a close and
natural way, a Lie algebra $\mathfrak{g}$, the two being connected via one-parameter groups and the
exponential map. These facts constitute an important part of the foundations of Lie theory. We
will describe briefly what we have omitted from the standard account.

First, we have not treated Lie groups as abstract things-in-themselves, but have only dealt with
them as subgroups of a standard group, $\mathrm{GL}(V)$. We could not have discussed abstract Lie groups
without assuming the standard language of differentiable manifolds. Our approach allowed us to
bring to the fore the remarkable Theorem 17, which asserts that merely the requirements of being
closed and being a group inside $\mathrm{GL}(V)$ (or any Lie group) suffice to make the group a smooth
manifold. This indicates what a strong regularity condition the group property is. Research over
the past decades has continued to underscore this theme [BT], [Ma], [Mo].

Second, we have not demonstrated how complete and mutual is the relationship between Lie
groups $G$ and their Lie algebras $\mathfrak{g}$. It is in this direction that the principal technical complications
of the theory lie. For example, although we have shown how to attach a Lie algebra to every
matrix group, we have not tried to attach a group to every Lie subalgebra of $\mathrm{End}\,V$. Indeed, this is
not possible if one sticks to matrix groups; the one-parameter groups obtained by exponentiating
elements in a given Lie algebra $\mathfrak{g}$ will generate a group which in a suitable sense has $\mathfrak{g}$ as its Lie
algebra, but this group will not always be closed in $\mathrm{GL}(V)$. The simplest example is probably the
one-parameter group $\exp tA_x$ in $\mathrm{GL}_4(\mathbb{R})$, where

$$A_x = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & x \\ 0 & 0 & -x & 0 \end{bmatrix}$$

and $x$ is any irrational number. Also the question of the relation of two matrix groups which have
isomorphic Lie algebras, essentially the question of the converse of Corollary 20, involves the
notion of covering space and fundamental group [Ms] and is beyond the scope of this discussion.
Interestingly enough, both these questions are most vexed for the most simple-minded case:
abelian Lie groups and their Lie algebras. We close these brief remarks by pointing out that, when
$G$ is fairly nonabelian, especially if the center of $G$ is discrete, the existence of the adjoint action
and formula (5.6) in particular go a long way toward showing that $G$ is nearly determined by $\mathfrak{g}$.
After the foundations comes the rather extensive development of the structure theory of Lie
algebras, with direct consequences for the groups. Several fine accounts of the theory of Lie
algebras are available, for example [J], [Hu]. Beyond the theory of Lie groups and algebras in
themselves lies the vast domain of their applications. We have mentioned a few of these in the
introduction and in §7. Some representative references for applications are [BC], [HP], [Hr], [Ko],
[Lo].

Our treatment in §§3, 4, 5 has been concrete in that we worked only inside $\mathrm{End}\,V$, but it was
also abstract in that it was coordinate free. We record here some common terminology used when
bases are introduced. Let $\mathfrak{g} \subseteq \mathrm{End}\,V$ be a Lie subalgebra. Let $\{y_i\}$, $1 \le i \le \dim \mathfrak{g}$, be a basis for $\mathfrak{g}$.
Then the fact that $\mathfrak{g}$ is a Lie algebra amounts to the statement that the commutators $[y_i, y_j]$ are
again linear combinations of the $y_k$'s. Thus we have equations

(6.1)    $[y_i, y_j] = \sum_{k} c_{ij}^{k}\, y_k$,

where the $c_{ij}^{k}$ are real numbers. The equations (6.1) are called the commutation relations of the $y_i$'s
and the $c_{ij}^{k}$ are called the structure constants of $\mathfrak{g}$ with respect to the $y_i$.

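For example, set

$$e^{+} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad e^{-} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \qquad h = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.$$

The matrices $e^{+}$, $e^{-}$, and $h$ form a basis for $\mathfrak{sl}_2$, the $2 \times 2$ traceless matrices, one of the most
fundamental Lie algebras. It is easy to compute that their commutation relations are

$$[h, e^{+}] = 2e^{+}, \qquad [h, e^{-}] = -2e^{-}, \qquad [e^{+}, e^{-}] = h.$$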
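These commutation relations can be confirmed with exact integer arithmetic. The variable names below (`e_plus`, `e_minus`, `h`, `relations`) are our own labels for the matrices just defined.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def bracket(X, Y):
    """Commutator [X, Y] = XY - YX, as in (4.2)."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[a - b for a, b in zip(r, s)] for r, s in zip(XY, YX)]

def scaled(c, X):
    return [[c * v for v in row] for row in X]

e_plus = [[0, 1], [0, 0]]
e_minus = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]

# Each entry pairs a computed bracket with the right-hand side claimed in the text.
relations = {
    "[h, e+] = 2e+": (bracket(h, e_plus), scaled(2, e_plus)),
    "[h, e-] = -2e-": (bracket(h, e_minus), scaled(-2, e_minus)),
    "[e+, e-] = h": (bracket(e_plus, e_minus), h),
}
all_hold = all(lhs == rhs for lhs, rhs in relations.values())
```

Read against (6.1), the nonzero structure constants of $\mathfrak{sl}_2$ in this basis are therefore $2$, $-2$, and $1$, together with their negatives from skew symmetry.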


7. Relations with the Standard Curriculum

In this section we give some examples of how Lie theory makes contact with current staples of 
undergraduate mathematics. We must of course be very restrictive and brief. 

1. Many of the standard theorems of linear algebra are of course also part of the fabric of Lie
theory, and gain coherence when considered in that light. For example, several of the standard
canonical forms, e.g., Jordan form, the diagonalization of the (skew) Hermitian matrices, amount
to classification of the conjugacy classes (orbits under the adjoint action) in a Lie algebra. Jordan
form describes conjugacy classes in $\mathrm{End}(\mathbb{C}^n) \simeq \mathfrak{gl}_n(\mathbb{C})$, and diagonalization of Hermitian matrices
describes conjugacy classes in $U_n$, the $n \times n$ unitary group. We cannot explain this interpretation
of these results in detail, but encourage the reader to explore it by further reading.

Also, $\mathrm{SL}_n(\mathbb{R})$ or $\mathrm{SL}_n(\mathbb{C})$ are examples of an extremely important class of Lie groups called
semisimple groups, and several well-known results in linear algebra are special cases for $\mathrm{SL}_n(\mathbb{R})$ or
$\mathrm{SL}_n(\mathbb{C})$ of structure theorems for semisimple groups. (Since $\mathrm{GL}_n$ and $\mathrm{SL}_n$ are so similar, we state
the results for $\mathrm{GL}_n$.) The polar decomposition or singular value decomposition [St] says that any
$A \in \mathrm{GL}_n(\mathbb{R})$ may be written in the form

$$A = OS = O_1 D O_2,$$

where $O$, $O_1$, and $O_2$ are orthogonal matrices, $S$ is symmetric, and $D$ is diagonal with positive
entries. This is the specialization to $\mathrm{GL}_n(\mathbb{R})$ of what is known as the Cartan decomposition [He] in
the context of semisimple Lie groups. Also, the Gram-Schmidt orthonormalization procedure [St]
says, in group-theoretical terms, that any $A \in \mathrm{GL}_n(\mathbb{R})$ may be written in the form

$$A = OB = ODU,$$

where $O$ is orthogonal, $B$ is upper triangular, $D$ is diagonal with positive entries, and $U$ is upper
triangular with diagonal entries all equal to 1. For general semisimple groups, this is known as the
Iwasawa decomposition [He].

Various basic features in the elimination theory, including the “LU factorization” [St] of a
generic matrix into the product of an upper triangular and a lower triangular matrix, and the
“reduced row-echelon form” [DN] are aspects of a different kind of decomposition of semisimple
groups, known as the Bruhat decomposition [Bo].


2. The cross product on $\mathbb{R}^3$ defines a Lie algebra structure on $\mathbb{R}^3$. This is in fact isomorphic to
$\mathfrak{o}_3$, the Lie algebra of $\mathrm{O}_3$, the $3 \times 3$ skew symmetric matrices. The isomorphism is accomplished
by

$$(x, y, z) \to \begin{bmatrix} 0 & -x & -y \\ x & 0 & -z \\ y & z & 0 \end{bmatrix}.$$

The generalization of this correspondence to higher dimensions leads to the theory of spinors and
Clifford algebras [J2].
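The claim that this map carries the cross product to the commutator can be checked with exact integer arithmetic. The function names `hat` and `cross` and the test vectors are our own; `hat` implements the correspondence displayed above.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def bracket(X, Y):
    """Commutator [X, Y] = XY - YX, as in (4.2)."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[a - b for a, b in zip(r, s)] for r, s in zip(XY, YX)]

def hat(v):
    """The skew-symmetric matrix assigned to (x, y, z) by the map in the text."""
    x, y, z = v
    return [[0, -x, -y], [x, 0, -z], [y, z, 0]]

def cross(u, v):
    """The usual cross product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (1, 2, 3), (-4, 0, 5)                       # arbitrary integer test vectors
lhs = hat(cross(u, v))                             # image of the cross product
rhs = bracket(hat(u), hat(v))                      # commutator of the images

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
basis_ok = all(hat(cross(a, b)) == bracket(hat(a), hat(b))
               for a in basis for b in basis)
```

Since both sides are bilinear in $u$ and $v$, agreement on the standard basis already implies agreement everywhere; the extra pair `u`, `v` is only a spot check.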

3. The fact that second mixed partial derivatives are equal is a reflection of the fact that $\mathbb{R}^n$ is
an abelian Lie group.

4. The theory of Fourier series and Fourier transform is best understood group-theoretically. 
See [Gr] for a discussion. 


5. It is fairly routine in quantum mechanics courses, in conjunction with the Schrödinger
equation for the hydrogen atom and angular momentum, to introduce “raising and lowering
operators” [Me]. The operators belong to the complexification of $\mathfrak{o}_3$, which is isomorphic to
$\mathfrak{sl}_2(\mathbb{C})$. The commutation relations of the Lie algebra figure importantly in the computations. The
harmonic oscillator is also susceptible to a Lie-theoretic treatment. The Canonical Commutation
Relations themselves are the laws for a bracket relation on a Lie algebra, known as the Heisenberg
Lie algebra [Ca], [Ho]. The relation of this algebra with quantum mechanics, and physics
generally, is deep and extensive.

6. Perhaps the part of standard undergraduate mathematics that is pedagogically most 
compatible with Lie theory is differential equations. We have already discussed in §2 how the 
notion of one-parameter group is a geometrization of the solution of a system of differential 
equations. And in §3 we noted that one-parameter groups of linear transformations were 
associated with the very important class of linear, constant coefficient systems. Indeed, the 
exponential map and linear algebra techniques are often explicitly used in treating these systems 
[Br]. 
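As a concrete instance, the system $x' = -y$, $y' = x$ has coefficient matrix $A$ with rows $(0, -1)$ and $(1, 0)$, and its solutions are given by the one-parameter group $t \mapsto \exp(tA)$ of rotations. The sketch below approximates the exponential by a truncated power series, which is adequate for this small example but is not a recommended general-purpose method. The names `mat_exp` and `flow` are hypothetical.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=30):
    """Approximate exp(a) by the first `terms` terms of sum a^k / k!."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term a^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def flow(a, t):
    """The one-parameter group t -> exp(t a)."""
    return mat_exp([[t * x for x in row] for row in a])


A = [[0.0, -1.0], [1.0, 0.0]]

# The solution with initial condition (1, 0) is the first column of exp(tA).
E = flow(A, 0.7)
print(E[0][0], math.cos(0.7))
print(E[1][0], math.sin(0.7))

# Group law: exp((s + t)A) = exp(sA) exp(tA).
lhs = flow(A, 0.5 + 0.7)
rhs = matmul(flow(A, 0.5), flow(A, 0.7))
```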
Many of the important classical differential equations are related to Lie theory. Indeed, much 
of the theory of special functions may be considered a branch of Lie theory [Mi], [V]. Below I 
state, always by way of example, some exercises which I have given to students in differential 
equations courses and which were favorably received. 

A.(i) Let $P$, $Q$ and the identity operator $I$ span a Lie algebra, with commutation relations 
$[P, Q] = I$, and of course $[P, I] = [Q, I] = 0$. (These are the Canonical Commutation Relations.) 
Define $L = (P - I)QP$, and $A_n = (P - I)^n Q^n$ (so $L = A_1 P$). Show that 

(a) $[Q, (P - I)^n] = -n(P - I)^{n-1}$, 

(b) $A_{n+1} = (A_1 + n)A_n$, 

(c) $[L, A_1] = L - A_1$, and 

(d) $L(A_1 + n) = (A_1 + n)L + (L + n) - (A_1 + n)$. 

(ii) Suppose $v_n$ is an eigenvector for $L$, with eigenvalue $-n$, so that $(L + n)v_n = 0$. Show from 
(d) above that $(A_1 + n)v_n$ is an eigenvector for $L$, with eigenvalue $-(n + 1)$. Conclude from (b) 
that if $v_0$ is an eigenvector of $L$ with eigenvalue 0, then $A_n v_0 = v_n$ is an eigenvector with eigenvalue 
$-n$. 


(iii) Show that if $P = d/dx$ and $Q$ = multiplication by $x$, then $P$ and $Q$ satisfy the relations 
above. Show also that 

$$e^x \frac{d}{dx} e^{-x} = P - I.$$

Conclude that a solution to the Laguerre equation $xy'' + (1 - x)y' + ny = 0$ is 

$$e^x \frac{d^n}{dx^n}\left(e^{-x} x^n\right) = (P - I)^n Q^n (1);$$

here 1 is the constant function on $\mathbb{R}$. 
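Exercise A can be tested on polynomials, where $P = d/dx$ and $Q$ = multiplication by $x$ act on coefficient lists. The sketch below is only a check on examples, not a proof. It applies $(P - I)^n Q^n$ to the constant function 1 and confirms that the result satisfies $xy'' + (1 - x)y' + ny = 0$, then tests identity (b) on a sample polynomial. The list representation and all function names are assumptions of this illustration.

```python
def deriv(p):
    """P = d/dx on a coefficient list [c0, c1, ...], c_k multiplying x**k."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def times_x(p):
    """Q = multiplication by x."""
    return [0] + p


def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(c, p):
    return [c * a for a in p]


def is_zero(p):
    return all(c == 0 for c in p)


def a_op(n, p):
    """A_n = (P - I)^n Q^n applied to the polynomial p."""
    for _ in range(n):
        p = times_x(p)
    for _ in range(n):
        p = add(deriv(p), scale(-1, p))  # (P - I) p = p' - p
    return p


def laguerre_residual(n, y):
    """Left side x y'' + (1 - x) y' + n y of the Laguerre equation."""
    d1 = deriv(y)
    d2 = deriv(d1)
    lhs = add(times_x(d2), d1)               # x y'' + y'
    lhs = add(lhs, scale(-1, times_x(d1)))   # - x y'
    return add(lhs, scale(n, y))             # + n y


solutions = {n: a_op(n, [1]) for n in range(6)}
print(solutions[2])  # coefficients of 2 - 4x + x^2
print(all(is_zero(laguerre_residual(n, y)) for n, y in solutions.items()))
```

The polynomial obtained for a given $n$ is $n!$ times the classical Laguerre polynomial in its usual normalization.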

B.(i) Take $P$, $Q$ and $I$ as in A(i). Suppose $Pv_0 = 0$, and set $v_n = Q^n(v_0)$. Show inductively that 
$Pv_n = nv_{n-1}$. Conclude that $v_n$ is an eigenvector of eigenvalue $n$ for $QP$. 

(ii) Put $P = d/dx$, $Q = (d/dx) + x$. Verify that these satisfy the correct commutation relations, 
and show that 

$$Q = e^{-x^2/2} \frac{d}{dx} e^{x^2/2}.$$

Show that solutions of Hermite’s equation $y'' + xy' - ny = 0$ are given by 

$$e^{-x^2/2} \frac{d^n}{dx^n}\left(e^{x^2/2}\right) = Q^n(1).$$

In addition to Rodrigues-type formulas such as the above, one can deduce in a purely formal 
manner recursion relations and other properties of the Hermite, Laguerre, Legendre, Bessel, and 
many other classical families of functions. 
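Exercise B lends itself to a direct check on polynomials. Taking $P = d/dx$, $Q = (d/dx) + x$ and $v_0 = 1$ (so that $Pv_0 = 0$), the sketch below builds $v_n = Q^n(v_0)$ as a coefficient list, verifies the recursion $Pv_n = nv_{n-1}$ from B(i), and confirms that each $v_n$ satisfies $y'' + xy' - ny = 0$. The representation and function names are assumptions of this illustration, and the check covers only the first few values of $n$.

```python
def deriv(p):
    """P = d/dx on a coefficient list [c0, c1, ...], c_k multiplying x**k."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def times_x(p):
    return [0] + p


def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(c, p):
    return [c * a for a in p]


def q_op(p):
    """Q = d/dx + x."""
    return add(deriv(p), times_x(p))


def hermite_residual(n, y):
    """Left side y'' + x y' - n y of Hermite's equation as written in B(ii)."""
    d1 = deriv(y)
    d2 = deriv(d1)
    return add(add(d2, times_x(d1)), scale(-n, y))


# v[n] = Q^n(1), built by repeated application of Q to v_0 = 1.
v = [[1]]
for _ in range(6):
    v.append(q_op(v[-1]))

print(v[2], v[3])  # coefficients of 1 + x^2 and 3x + x^3
print(all(deriv(v[n]) == scale(n, v[n - 1]) for n in range(1, 7)))
```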

I would like to thank Kenneth Gross for painstaking efforts to improve the readability of this paper. 